ABSTRACT


This study applies regression techniques to United States Military Academy
(USMA) cadet/candidate data to develop a hardiness-prediction model
and to explore retention both at USMA and after graduation.

We created several data sets using 42 variables from three cohorts (N =
3,716) and analyzed them using regression techniques. Preliminary results
showed high school type and the interaction between gender and parents’
education level as significant. Specifically, attending a private religious high
school and being a male cadet with a less-educated father are positive
predictors of hardiness (R² = 0.05).

Model quality improved in subsequent regressions once we identified a target
population. Among varsity football players (N = 149), less-educated mothers and
liberal political views are negative predictors of hardiness while race and parents’
military service history (African Americans with fathers who served in the military)
and prep school attendance are positive predictors of hardiness (R² = 0.97).

Logistic  regression  results  suggest  military,  physical,  and  academic 
performance  are  positive  predictors  of  USMA  retention  while  hardiness- 
challenge,  participation  in  varsity  athletics,  and  less-educated  fathers  are 
negative  predictors. 

Logistic regression results identified basic branch as the sole positive
predictor of U.S. Army officer retention beyond a USMA graduate’s sixth year of
active  federal  service.  Infantry  officers,  followed  by  military  police,  armor  and 
engineers,  remain  in  service  longer  (medical  corps  and  aviation  branch  officers 
excluded). 
INTRODUCTION


The United States Military Academy (USMA or West Point), located at
West Point, New York, was founded in 1802 to train military officers
in leadership and engineering and to provide a superb four-year education. After
two  centuries,  West  Point’s  education  now  focuses  on  the  leader  development  of 
cadets  in  academic,  military,  and  physical  domains,  all  underwritten  by 
adherence  to  a  code  of  honor  (United  States  Military  Academy  [USMA],  2013a). 
Today,  USMA  carries  out  its  founding  fathers’  legacy  in  an  unpredictable 
environment  of  chaos  by  preparing  young  officers  for  a  career  in  the  United 
States  Army. 

In  an  age  of  performance  refinement,  USMA,  like  most  premier 
institutions,  must  attract  agile  and  adaptable  leaders  capable  of  meeting  intense 
demands.  This  study  looks  into  one  of  the  hidden  treasures  of  performance 
measurement,  hardiness.  Hardiness  is  the  pattern  of  courage  and  motivation 
one  uses  to  determine  advantageous  performance  (Maddi,  Matthews,  Kelly, 
Villarreal,  &  White,  2012).  Assessed  during  a  cadet’s  first  summer,  hardiness 
accounts  for  the  difference  in  response  to  adversity  between  individuals  and  may 
be  a  tool  USMA  can  use  to  select  and  retain  quality  personnel. 

This thesis first explores archived data from three USMA cohorts to
determine how the variables they contain relate to hardiness. Second, we investigate the
relationship of hardiness, and other variables, to retention, both during and after
West Point. The goal of this project is to develop a mathematical model that
accurately predicts hardiness and retention.

A.  RESEARCH  OBJECTIVE 

In  concert  with  the  literature  review,  the  first  research  objective  uses  data 
obtained  from  the  USMA  Office  of  Institutional  Research  (OIR)  to  develop  a 
hardiness  predictor.  Previous  research  revealed  the  power  of  hardiness  in 
predicting performance across multiple domains beyond the Five Factor Model
(FFM). USMA does not allow personality testing as a selection tool. However,
pre-admission information may be able to predict a
candidate’s hardiness. For this objective, we defined a new success category
with  hardiness  as  the  outcome  and  pre-admission  “predictor”  variables  from  OIR. 
The  desired  output  from  this  research  objective  is  a  linear  model,  useful  for 
predicting  future  hardiness  scores  for  cadet  candidates. 
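
As a minimal sketch of the kind of linear model this objective calls for, the Python below fits an ordinary least squares model by solving the normal equations. The predictor names (`private_religious_hs`, `attended_prep`), the data values, and the helpers `fit_ols` and `predict` are hypothetical and chosen only for illustration; they are not the OIR variables or the results of this study.

```python
def fit_ols(rows, y):
    """Return [b0, b1, ..., bk] minimizing squared error for y ~ b0 + sum(bi * xi)."""
    X = [[1.0] + [float(v) for v in r] for r in rows]  # prepend intercept column
    n, k = len(X), len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [
        [sum(X[m][i] * X[m][j] for m in range(n)) for j in range(k)]
        + [sum(X[m][i] * y[m] for m in range(n))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, row):
    """Apply fitted coefficients to one candidate's predictor values."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))


# Synthetic, made-up data: (private_religious_hs, attended_prep) -> hardiness score.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (0, 0), (1, 1)]
scores = [30.0, 32.0, 31.0, 33.0, 30.0, 33.0]
coefs = fit_ols(rows, scores)
new_candidate_score = predict(coefs, (1, 0))
```

Dummy-coding categorical pre-admission variables as 0/1 indicators, as assumed here, lets each coefficient be read as the expected change in the hardiness score associated with that category.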

The  second  research  objective  explores  the  relationship  between 
hardiness  and  retention.  Numerous  and  varied  circumstances  faced  by 
members  of  the  U.S.  military  influence  an  individual’s  decision  to  leave  or  stay  in 
the  armed  forces.  Although  we  test  additional  predictors,  a  rounded  investigation 
may  indicate  that,  regardless  of  circumstance,  hardiness  influences  retention. 

We investigate the retention research objective in two forms: first,
graduation status (i.e., whether the cadet graduated or separated from USMA);
second, active duty status (i.e., whether the U.S. Army retained the USMA graduate
beyond his or her¹ initial service obligation or lost the graduate).
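
Both retention forms are binary outcomes, which is why logistic regression suits them. The sketch below shows only how a logistic model turns predictor values into a retention probability. The function name `retention_probability`, the two predictors, and the coefficient values are illustrative assumptions, not estimates from this study.

```python
import math


def retention_probability(intercept, coefs, features):
    """Logistic model: P(retained) = 1 / (1 + exp(-(b0 + sum(bi * xi))))."""
    z = intercept + sum(b * x for b, x in zip(coefs, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for (standardized performance score, varsity athlete flag).
# A positive coefficient raises the predicted probability; a negative one lowers it.
intercept = 0.0
coefs = [0.8, -0.5]

p_baseline = retention_probability(intercept, coefs, (0.0, 0))
p_high_performer = retention_probability(intercept, coefs, (1.0, 0))
p_varsity = retention_probability(intercept, coefs, (0.0, 1))
```

With the outcome coded 1 for retained and 0 for lost, the sign of each fitted coefficient indicates whether a variable is a positive or negative predictor of retention.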

B.  BACKGROUND 

1.  Service  to  the  Nation — USMA’s  Mission 

A  proper  mission  statement  attracts  prospects,  guides  proponents,  and 
provides a means for an organization to measure performance. Since 1925,
USMA and Army Regulations have documented various mission statements prepared
by its leaders to communicate West Point’s strategic aim. Past USMA mission
statements  range  from  general  to  specific,  but  as  a  whole,  center  on  preparing 
the  Corps  of  Cadets  for  service  to  the  nation  in  the  capacity  of  Army  officers. 

The  current  mission  of  USMA  is: 

To educate, train, and inspire the Corps of Cadets so that each
graduate is a commissioned leader of character committed to the
values of Duty, Honor, Country; and prepared for a career of
professional excellence and service to the Nation as an officer in
the United States Army. (USMA, 2013a)

¹ Hereafter we use “his” in reference to both genders to limit wordiness.

2.  Life  as  a  Cadet 

a.  Admissions 

A  West  Point  cadet  is  a  volunteer  member  of  the  United  States 
Corps  of  Cadets  (USCC),  selected  through  a  rigorous  admissions  process,  who 
endures  a  47-month  experience  designed  to  prepare  him  to  lead  in  a  world  of 
complexity and uncertainty. Cadets comprise prior-service active duty,
Reserve or National Guard soldiers, high school graduates and international
military officers between 17 and 22 years of age.

Admission into USMA requires a prospective candidate to
demonstrate academic, physical and social history prowess. Each appointee must
receive a congressional or service-connected nomination granted by the Vice
President of the United States, U.S. Senators and Representatives, Delegates of
the House of Representatives, or the Secretary of the Army, as well as governors
or commissioners of several U.S. territories such as Samoa, Puerto Rico, Guam, and
the Mariana Islands (USMA, 2013b).

b.  The  United  States  Corps  of  Cadets 

There  are  four  year  groups  (YG),  freshman  through  senior  class, 
each  broken  into  four  regiments  of  nine  cadet  companies  with  equal  numbers  of 
upper  and  lower  classmen  in  each  company.  In  addition  to  academic  and 
military  requirements,  the  upperclassmen  are  responsible  for  the  conduct  and 
training of underclassmen. Plebes (freshmen) spend their entire first year
memorizing West Point’s history, which includes quotes from famous
graduates, national or military songs, creeds and key definitions. Plebes must
also  memorize  current  events  and  recite  knowledge  assigned  to  them  by  their 
upper-class  chain-of-command. 
All cadets participate in an athletic activity: intramurals, club, or
corps  squad  (intercollegiate).  While  intramural  sports  remain  internal  to  West 
Point,  club  and  corps  squad  teams  travel  across  the  country  to  participate  in 
various  NCAA  or  university  contests.  Although  much  of  the  academic  military 
training  occurs  during  the  academic  year,  West  Point  dedicates  its  summers  to 
military training. As one can imagine, a cadet’s life is extremely busy: his
day begins as early as 0530 and ends about midnight. Graduation from USMA is
the  ultimate  qualification  necessary  to  become  an  Army  officer,  but  there  are 
prerequisite  qualifications  to  graduation,  one  of  which  is  job  qualification.  Job 
qualification  consists  of  three  categories:  admission,  graduation  and  officer. 

c.  USMA  Preparatory  School 

Cadets  who  attended  the  USMA  Preparatory  School  (USMAPS) 
hold  25  percent  of  leadership  positions  within  the  Corps  of  Cadets.  The  chief 
purpose  of  USMAPS  is  to  assist  in  preparing  high  school  graduates  and/or 
enlisted  personnel  from  the  active  duty,  Reserve,  or  National  Guard  force  for  the 
academic  rigors  of  West  Point.  Located  on  the  grounds  of  West  Point,  USMAPS 
conducts  operations  similar  to  the  Academy.  Upon  successful  completion  of  the 
one-year program, attendees of USMAPS become fourth-class cadets
(freshmen) at West Point.

d.  Graduation 

A  cadet  is  responsible  for  upholding  the  cadet  honor  code  and 
passing  his  academic,  military  and  physical  events  prior  to  graduation.  The 
following  excerpt  from  USMA’s  academic  catalog  further  defines  graduation  from 
the  Academy: 

Regulations  for  the  United  States  Military  Academy  state  that 
cadets  of  the  First  Class  who  have  been  found  by  the  Academic 
Board  successfully  to  have  completed  the  course  of  instruction, 
including  academic,  military,  and  physical  education  and  training;  to 
have  maintained  the  standards  of  conduct;  and  to  possess  the 
moral  qualities,  traits  of  character  and  leadership  essential  for  a 
graduated  cadet;  shall  receive  a  diploma  signed  by  the 
Superintendent, the Commandant of Cadets, and the Dean of the
Academic Board; and shall thereupon become a graduate of the
United States Military Academy with a degree of Bachelor of
Science. (Office of the Dean, 2010)

3.  Performance  Measurement  at  USMA 

In  the  land  of  leadership,  performance  is  king.  To  train  and  generate  high- 
caliber  officers  to  lead  and  fight  our  nation’s  wars,  an  accurate  measurement  of 
their  performance  is  necessary.  In  fact,  no  organization  claims  legitimacy  without 
first  producing  a  quality  product  to  the  customers’  satisfaction.  Nevertheless, 
what  exactly  is  effective  performance? 

Cook  (2009)  explains  that  when  someone  digs  deeper  into  what 
constitutes  effective  performance,  questions  arise  concerning  the  real  nature  of 
work  or  the  true  purpose  of  organizations.  Cook  wondered  if  we  measure 
success  best  by  counting  objects  produced  or  subjectively  by  informal  opinion. 
Those  interested  in  attending  USMA  have  their  performance  measured  from  the 
moment  the  Admissions  Department  receives  their  applications  to  the  day  of  their 
graduation.  Before  an  appointment  to  USMA  is  granted,  a  “range  of  tests  are 
used  to  assess  a  range  of  attributes”  (Cook,  2009). 

4.  Life  as  an  Army  Officer 

After successful completion of the academic, physical and military
requirements, USMA commissions its senior class into the United States Army as
Second Lieutenants. USMA expects its commissioned leaders to develop a
“capacity to lead.”


a.  Officer  Qualifications 

Before leaving USMA, the senior (“firstie”) class assesses into one
of the seventeen Army career fields (“branches”). Order of merit (i.e., class
rank) determines branch choice: a firstie may choose any branch for which he
meets the basic requirements. For example, seniors wishing to select the
aviation branch must pass a comprehensive flight physical and the Army flight
aptitude selection test. There are similar requirements for cadets wishing to
select the medical service corps as their branch in order to become doctors or
nurses.

A  litmus  test  of  USMA’s  effectiveness  as  an  institution  is  the 
graduate’s  ability  to  achieve  various  officer  qualifications  after  leaving  the 
Academy.  The  Basic  Officer  Leadership  Course  (BOLC),  for  example,  certifies  a 
recent  graduate  to  serve  in  his  chosen  branch  at  the  Platoon  Leader  level;  all 
lieutenants  must  pass  BOLC  before  departing  to  their  first  duty  station.  The 
Captains Career Course and the Command and General Staff College (Intermediate
Level Education) prepare a Captain and Major, respectively, to lead and serve in
various staff positions. For lieutenant colonels and above, the Senior Service
College (i.e., Army War College) provides courses to prepare them to assume
strategic  leadership  responsibilities  in  military  or  national  security  organizations 
(Human  Resources  Command  [HRC],  2013). 

A  qualified  officer  not  only  facilitates  the  Army  mission,  but  also 
remains  competitive  in  his  career  category,  making  promotion  inevitable. 
However,  meeting  the  basic  qualifications  is  an  expectation  rather  than  an 
exception.  Thus,  exemplary  evaluations  from  superiors  play  a  significant  role  in 
identifying  qualified  officers  for  promotion. 

C.  REVIEW  OF  LITERATURE 

All  academic  organizations  have  the  success  of  their  personnel  as  their 
chief  goal.  West  Point  is  no  different.  The  undeniable  goal  of  the  mission 
statement  is  “to  produce  agile  and  adaptable  officers  who  are  developed 
intellectually  possess  the  knowledge  of  ‘how  to  think’  rather  than  simply  ‘what  to 
think’”  (USMA,  2009).  USMA,  as  an  academic  institution,  emphasizes 
intelligence  and  is  convinced  of  its  importance  to  leadership.  However,  there 
cannot  be  success  without  a  determination  to  perform  well. 
1. Human Performance

In  identifying  those  who  possess  the  commitment  to  “supporting  and 
defending  the  Constitution”  (United  States  Army  [USA],  1999),  the  word 
performance  comes  to  mind.  Yet,  how  do  we  measure  performance 
appropriately  and  in  harmony  with  USMA’s  goals?  In  Human  Performance: 
Cognition,  Stress  and  Individual  Differences  (Human  Performance),  Matthews  et 
al.  (2000)  mention  the  significance  of  valid  performance  measurements.  Validity 
exists  in  several  forms.  First,  Criterion  Validity  refers  to  “the  ability  of  a  test  or 
measure  to  predict  some  other  intrinsically  interesting  measure”.  Another  way  of 
saying  this  is,  do  the  performance  measures  relate  to  the  organization’s  goals? 

Secondly,  Construct  Validity  considers  whether  the  performance  measure 
assesses  a  meaningful  theoretical  construct  (Matthews  et  al.,  2000).  For 
example,  does  the  performance  measure  (e.g.,  Intelligence  Quotient  Test)  relate 
to  the  theory  behind  it  (true  Intelligence)?  Construct  Validity  is  interesting 
because  it  presupposes  the  theory  is  well  developed  and  understood.  For 
instance,  measuring  intelligence  assumes  we  know  what  “true  intelligence”  is  and 
assumes  we  can  accurately  measure  it  (Carter,  1991). 

A related concept is reliability. It refers to a performance measure’s ability
to yield similar outcomes over time. Matthews et al. (2000) say, “If measurement of
either the ability or criterion is unreliable, then high correlations between skill and
other measures cannot be expected...”

The  authors  of  Human  Performance...  offer  two  approaches  to  measuring 
performance  in  the  context  of  personnel  selection.  The  first  approach  uses 
general  mental  ability  assessments  (GMA),  while  the  second  approach  uses  a 
tailored  test  based  on  job  demand.  GMA  tests  a  broad  range  of  abilities  and  is 
increasingly  valid  when  assessing  intelligence-demanding  jobs.  GMA  tests, 
when  used  as  the  primary  instrument,  are  misleading  when  the  organization 
requires  personnel  to  use  physical  skill,  ethical  decision-making,  a  personality- 
centered  skill,  or  a  combination  of  the  previous  with  cognitive  skill.  Performance 
is a broad concept that must be described and designed for each organization
(Carter,  1991).  Furthermore,  when  speaking  of  cognitive  abilities,  Matthews  et 
al.  mention,  “for  accurate  prediction  of  performance,  knowledge  of  more  specific 
abilities  is  also  necessary....”  Thus,  even  for  intelligence  measurement, 
multidimensional  testing  is  necessary  to  measure  a  person’s  true  performance.  It 
is  easy  to  see  how  the  multidimensional  testing  becomes  a  series  of  tailored 
tests,  with  the  benefits  being  shared  confidence,  among  selectee  and 
organization,  in  the  expected  outcome(s). 

Neil  Carter  took  this  a  step  further  and  defined  outputs  and  outcomes  in 
his  1991  article  Learning  to  Measure  Performance:  The  Use  of  Indicators  in 
Organizations.  Carter  (1991)  asserted  that  outputs  eclipsed  the  goal  of  quality 
and  customer  satisfaction,  and  he  used  a  financial  example  to  communicate  his 
point: 

To  assess  the  meaning  of  profits  (or  of  alternative  key  indicators) 
involves  forming  a  judgment  on  the  performance  not  just  of  the  firm 
in  question  but  of  its  competitors,  as  well  as  strategic  judgments 
about  the  long-term  effects  of  current  pricing  and  investment 
decisions.  (Carter,  1991) 

Although  the  present  work  will  not  discuss  financial  aspects,  the  spirit  of 
Carter’s  claim  is  in  finding  the  right  indicators  to  measure  true  (and  lasting) 
performance.  Alas,  the  ability  to  measure  outcomes  (e.g.,  qualified  and  agile- 
minded  U.S.  Army  officers)  versus  mere  outputs  (e.g.,  graduates),  irrespective  of 
quality, is difficult. Performance measurement must begin with an organization’s
definition  of  success,  goals,  and  outcomes,  and  then  work  backwards  from  there 
(Carter,  1991). 

Another recent study that drew conclusions on the indicators of USMA
cadet performance is a Fielding Graduate University doctoral dissertation by
Jennifer Clark (2007). Clark identified three performance goals for USMA
cadets. The opening paragraph of her dissertation identifies two goals:
the “ultimate success” for a USMA cadet is graduation from the academy
and a productive career as a commissioned officer of the United States Army.
The third goal, in concert with her research objective, was academic performance
as measured by cumulative grade point average (GPA) (Clark, 2007).

In her dissertation, Clark sought to identify the relationship between
sleep quality and sleep characteristics, including a Morningness-Eveningness
characteristic, and personality factors as described by the Five Factor Model
(FFM), and to determine their effect on academic performance at USMA (Clark,
2007). The FFM asserts that five personality factors (Openness to Experience,
Conscientiousness, Extraversion, Agreeableness, and Neuroticism) best describe
an individual’s personality (Costa & McCrae, 2013; McCrae & John, 1992). A
discreet device called an actigraphy watch, worn on the wrist, measured sleep
quality. Freshman cadets who participated in the study wore an actigraphy
watch at various times during the year.

Morningness (Horne & Ostberg, 1977) is a character trait used to describe
one who is “predisposed to waking up earlier rather than later and to going to bed
earlier rather than later” (Clark, 2007). The opposite is true of Eveningness.
Clark  hypothesized  that  “...given  the  characteristics  of  evening  and  morning 
types  and  the  effect  of  lack  of  sleep  on  performance... daytime  sleepiness  can 
lead  to  decreased  attention,  concentration,  and  poorer  academic  performance.” 
Secondly,  Clark  hypothesized  “...personality  factors  may  ameliorate  some 
negative  effects  of  sleep  habits,  sleep  quality,  or  sleep  quantity  in  order  to  enable 
one  to  successfully  meet  the  mental,  physical,  and  emotional  challenges  of  their 
environment”  (Clark,  2007).  Clark  found  Morningness  positively  related  to  FFM 
Conscientiousness,  and  as  a  byproduct  of  her  research,  Conscientiousness 
positively  related  to  academic  performance,  regardless  of  sleep  quality  attained. 
However,  Clark’s  research  concluded  personality  traits,  sleep  quality,  and 
Morningness-Eveningness  were  unrelated  to  better  academic  performance. 

Coincidentally, Clark’s literature review noted findings she later confirmed
in her own study. Clark cited researchers (Trockel, Barnes, & Egget, 2000) who
found “examples of individuals who averaged less than five hours of sleep per
night and yet did not have low GPAs, indicating that there may be other
mitigating or moderating factors in place which affect the relationship between
sleep  and  GPA”  (Clark,  2007).  Additionally,  Clark  cites  previous  research 
(Chamorro-Premuzic  &  Furnham,  2003)  showing  personality  as  mildly  important 
in  predicting  academic  success;  FFM  facets  account  for  15-17  percent  of  the 
variance  in  academic  success,  thus,  revealing  the  possibility  for  other  domains  to 
account  for  the  remaining  variance.  Although  Clark  originally  hypothesized  sleep 
quality  and  quantity  as  the  culprit  of  performance  variance,  her  findings  proved 
contradictory.  Thus,  we  explore  a  different  characteristic  in  this  thesis,  namely, 
hardiness.  In  the  next  section,  we  see  how  crucial  hardiness  is  to  performance. 

2.  Stress,  Personality  and  Performance 

What  accounts  for  how  one  cadet  overcomes  adversity  while  others 
flounder  and  fail?  Bartone,  Snook,  and  Tremble  (2002)  published  an  article  that 
considered  the  USMA  admission  process  in  order  to  identify  the  pre-admission 
attributes  that  best  contribute  to  leader  performance  at  West  Point.  In  Cognitive 
and  Personality  Predictors  of  Performance  in  West  Point  Cadets,  Bartone  et  al. 
(2002)  used  hierarchical  multiple  regression  procedures  to  find  pre-admission 
variables  that  predicted  military  development  (MD)  grades  of  upperclassmen 
three to four years later. The MD grade measured the most enduring and
desired quality of West Pointers and U.S. Army officers: leadership performance.

Cadets  received  the  MD  grade  from  an  immediate  cadet  supervisor  and 
their  Army  officer  supervisor.  The  supervisors  generated  subjective  summations 
based  on  12  military-centered  dimensions:  duty  motivation,  military  bearing, 
teamwork,  influencing  others,  consideration  for  others,  professional  ethics, 
planning  and  organizing,  delegating,  supervising,  developing  subordinates, 
decision-making,  and  oral  and  written  communication.  For  cognitive  predictors, 
Bartone et al. assessed entering freshmen (termed “new cadets”) during their
first USMA experience, a summer military training program known as “Beast
Barracks,” using the following types of batteries: Spatial Judgment, Logical
Reasoning, Social Judgment and Problem Solving. Additionally, the College
Entrance Equivalency Rating (CEER), collected as part of the admissions
process, was used (CEER is computed by combining a weighted average of high
school class rank and pre-college Scholastic Aptitude Test scores).

Lastly, the study included FFM analog indicators as personality predictors.
The results showed CEER as the main predictor of
leader performance three to four years later (Bartone, Snook, & Tremble Jr.,
2002). This is surprising since CEER is academically oriented while the MD
grades, based on the twelve dimensions, appear relatively non-academic. The
CEER-MD grade connection reveals one of two possibilities. Bartone et al.
(2002)  suggest  that  either  supervisors/raters  ascertain  the  intelligence  of  rated 
cadets  before  judging  military  performance  or  CEER  is  really  measuring 
something  else,  namely,  underlying  personality  characteristics  positively  related 
to  military  performance. 

According to Bartone et al. (2002), the overall variance explained in leader
performance was “modest, leaving much unexplained... statistical significance among
predictors does not amount to variance (R² = .05) being accounted for.”
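
For reference, the coefficient of determination quoted above is conventionally defined as follows. This is the standard textbook definition, not a formula taken from Bartone et al. The symbols are notation introduced here: y_i is an observed outcome, ŷ_i the model's prediction, ȳ the mean of the observed outcomes, and n the number of observations.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

Under this definition, an R² of .05 means the predictors account for about 5 percent of the variance in the outcome, which is consistent with statistically significant predictors still leaving most of the variance unexplained.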

Digman  (1990)  details  the  development  of  the  FFM,  claiming  that  five 
dimensions  adequately  describe  normal  personality.  Digman  chronicled  the 
evolution  of  the  FFM  and  documented  studies  supporting  the  robustness  of  the 
FFM.  Yet,  Digman  admitted  issues  with  defining  the  personality  dimensions.  He 
argued  that  the  FFM,  at  the  very  least,  provided  a  broad  standard  and  a 
“surprisingly  general  theoretical  structure”  for  measured  personality  (Digman, 
1990).  Digman  ended  the  article  with  the  following  statement: 

The  why  of  personality  is  something  else.  If  much  of  personality  is 
genetically  determined,  if  adult  personality  is  quite  stable,  and  if 
shared  environment  accounts  for  little  variability  in  personality,  what 
is  responsible  for  the  remaining  variance?  Perhaps  it  is  here  that 
the  idiographic  (i.e.,  idiosyncratic)  study  of  the  individual  has  its 
place.  Or  perhaps  we  shall  have  to  study  personality  with  far 
greater  care  and  with  much  closer  attention  to  the  specifics  of 
development  and  change  than  we  have  employed  thus  far. 
(Digman,  1990) 
Previous research inspired further exploration of personality dimensions.
Our hope is to find a more descriptive factor to help us study the individual. That
brings us to hardiness.

3.  Hardiness 

Hardiness  is  “a  pattern  of  attitudes  and  skills  that  provides  the  existential 
form  of  courage  and  motivation  needed  to  learn  from  stressful  circumstances  in 
order  to  determine  what  will  be  the  most  effective  performance”  (Maddi, 
Matthews, Kelly, Villarreal, & White, 2012). According to Maddi et al., hardiness
is composed of three areas applicable to a variety of
situations and occupations, both physical and mental.

The  first  area  is  general  commitment  (versus  alienation)  to  work  and  life. 
A  person  high  in  hardiness-commitment  remains  vigorously  engaged  or  involved 
with others and activities. The second area is a high sense of control (versus
powerlessness), which urges a person to persevere so that his efforts influence
events and outcomes. The last area of hardiness is the ability to assess difficult
and  trying  situations  and  use  them  as  a  challenge  to  grow  (versus  a  threat  to 
avoid).  An  individual  high  in  hardiness-challenge  is  open  to  variety  and  changes, 
which  are  seen  as  an  opportunity  to  further  develop  through  what  is  learned 
(Maddi  et  al.,  2012). 

Bartone,  Eid,  Johnsen,  Laberg,  and  Snook  (2009)  summarized  a  similar 
study  on  personality  factors  as  indicators  of  performance.  The  study  evaluated 
the  influence  of  psychological  hardiness,  social  judgment,  and  FFM  personality 
dimensions  on  leader  performance  in  USMA  cadets.  The  study  used  the 
following  factors  as  potential  predictors  of  leader  performance:  gender,  CEER, 
social  judgment,  FFM-factors  and  hardiness.  The  Bartone  et  al.  (2009)  study 
measured leader performance similarly to Bartone et al. (2002); however,
researchers collected MD grades from two different periods: summer and
academic year. The first measurement averaged all three MD grades received
during the first three summers at USMA, while the second measurement
averaged all MD grades from the four-year academic periods. Lastly, a
combined leader performance measure averaged the two previous MD outcomes
(summer and academic year). After controlling for general intellectual abilities,
hierarchical regression results showed that FFM extroversion and hardiness, with a
trend for social judgment, predict leader performance in the summer field-training
environment. During the academic period, mental
abilities (CEER), FFM conscientiousness, and hardiness, with a trend for social
judgment, predicted leader performance (Bartone, Eid, Johnsen, Laberg, & Snook, 2009).
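
The three leader performance measures described above can be sketched as follows. The function name `leader_performance`, the grade values, and the 4.0 scale are assumptions made for illustration; the number of academic-year MD grades per cadet is also assumed.

```python
from statistics import mean


def leader_performance(summer_md, academic_md):
    """Return (summer, academic, combined) leader performance averages."""
    summer = mean(summer_md)              # mean of the summer MD grades
    academic = mean(academic_md)          # mean of the academic-year MD grades
    combined = mean([summer, academic])   # mean of the two measures above
    return summer, academic, combined


# Made-up MD grades on a hypothetical 4.0 scale.
summer, academic, combined = leader_performance(
    summer_md=[3.0, 3.5, 4.0],
    academic_md=[3.0, 3.0, 3.5, 3.5, 3.0, 3.0, 3.5, 3.5],
)
```

Note that the combined measure, as described, is the average of the two period averages rather than a pooled average of every grade, so the summer and academic contexts carry equal weight regardless of how many grades each contains.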

In  addition,  Bartone  et  al.  (2009)  found  evidence  of  a  relationship  between 
hardiness  and  FFM’s  extroversion  and  conscientiousness.  Therefore,  FFM 
factors  in  combination  with  hardiness  present  significant  results,  with  hardiness 
being the strongest predictor. Additionally, the Bartone et al. (2002) study used
FFM analog indicators as potential predictors of leader performance, finding
limited correlation between the predictors and the response. In fact, three of
the five FFM factors exhibited multicollinearity, while the remaining two showed
no correlation.
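
One common screen for multicollinearity is the pairwise correlation between predictors together with the variance inflation factor (VIF). The sketch below covers the two-predictor case, where VIF = 1 / (1 - r²). It is a standard diagnostic and not necessarily the procedure Bartone et al. used; the function names and the scores are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF when a model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Made-up factor scores for five hypothetical cadets.
extraversion = [1, 2, 3, 4, 5]
conscientiousness = [2, 1, 4, 3, 5]

r = pearson_r(extraversion, conscientiousness)
vif = vif_two_predictors(extraversion, conscientiousness)
```

A VIF near 1 indicates little shared variance between predictors, while larger values signal that the coefficient estimates are being inflated by overlap among them.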

These  findings  suggest  that  hardiness  may  capture  much  of  the 
performance  variance  we  are  interested  in  and  aid  in  identifying  the 
multicollinearity concerns among FFM factors. The FFM is unique in that it
developed over the better part of a half-century into the “unified framework” for
understanding normal personality that exists today (Bartone et al., 2009). For
now,  the  FFM  uses  the  five  factors  of  openness,  conscientiousness, 
extraversion,  agreeableness,  and  neuroticism;  however,  a  growing  collection  of 
literature  states  that  FFM  “may  not  fully  represent  all  of  the  personality-based 
differences  potentially  impacted  on  leadership  and  job  performance”  (Bartone  et 
al.,  2009).  In  fact,  Block  (1995)  and  Hough  (1992),  both  cited  by  Bartone  et  al. 
(2009),  echo  criticisms  of  the  FFM,  addressing  the  shortfalls,  namely  breadth,  of 
the  five  factors  to  predict  performance  with  the  desired  specificity.  Two  additional 
articles,  Duckworth,  Matthews,  Kelly  &  Peterson  (2007)  and  Bartone,  Kelly,  and 
Matthews  (2013),  take  a  similar  approach,  commenting  not  only  on  the  generality 
of the five factors, but also citing insignificant correlations with leader
performance  as  a  reason  to  disregard  the  FFM. 

The  aforementioned  studies  promote  hardiness  as  a  relevant  predictor  of 
leader performance. Additional results from Bartone et al. (2009) follow:

• Hardiness is uncorrelated with gender or CEER.

•  Gender  was  not  deterministic  of  summer  leadership  performance. 

•  CEER  (mental  ability)  was  not  deterministic  of  summer  leadership 
performance. 

•  CEER  is  a  significant  predictor  of  academic  leader  performance. 

•  Hardiness  is  the  most  significant  predictor  in  both  training  contexts. 

•  CEER  negatively  relates  to  FFM  neuroticism  and  extroversion. 

•  CEER  positively  relates  to  FFM  agreeableness  and 
conscientiousness. 

These  results  appear  to  reveal  that  FFM  is  useful  in  academic  contexts  at 
associating  a  single  factor  with  a  performance  outcome.  However,  when  the 
environmental context shifts, that predictor may not be significant. For instance,
conscientiousness does well in predicting outcomes within an academic context, but
not necessarily in the summer (training) context.

Hardiness,  on  the  other  hand,  is  detectable,  regardless  of  the  context  or 
situation,  provided  there  is  adversity  to  overcome.  Perhaps  the  hardy  individual 
is  blind  to  context;  his  hardiness  shines  through  regardless.  Bartone  et  al.  state, 
“hardiness  emerges  in  this  study  as  the  strongest  personality  predictor  of  leader 
performance,  and  the  only  personality  factor  predicting  leader  performance 
across  the  two  different  contexts”  (Bartone  et  al.,  2009). 

Interestingly,  hardiness  is  a  variable  already  collected  by  USMA  on  its 
cadets  during  their  first  summer.  Although  few  (West  Point)  entities  analyze 
hardiness  data,  it  is  stored  in  a  database  managed  by  OIR.  On  the  genesis  of 
hardiness testing at West Point, USMA Professor of Engineering Psychology
Michael D. Matthews wrote:
The genesis of hardiness research here traces back to research
conducted by COL Paul Bartone. Paul did his doctoral dissertation
at the University of Chicago, on the topic of hardiness, and
mentored by the scientist who invented the concept, Dr. Salvatore
Maddi. Prior to joining the West Point faculty in the late 1990s,
Paul had already conducted hardiness research on other military
populations. The decision to conduct hardiness research here was
simply to extend existing hardiness research to this particular
setting/venue. Since Paul departed in 2003, Dr. Dennis Kelly and I
have continued this line of research. To clarify a bit more, USMA
did not have a role in the genesis of this research. It was
scholar-driven, not institution driven. (Matthews, 2013)

Kelly  and  Matthews  conducted  a  recent  study  (2012)  with  hardiness 
creator  Dr.  Salvatore  Maddi  entitled  The  Role  of  Hardiness  and  Grit  in  Predicting 
Performance  and  Retention  of  USMA  Cadets.  The  research  resolved  to  identify 
the  relationship  between  hardiness,  grit2,  performance,  and  retention  of  plebes 
(freshmen  cadets)  “above  and  beyond”  the  Whole  Candidate  Score  (WCS).  The 
WCS,  when  used  in  combination  with  other  pre-admission  data,  is  the  primary 
predictor  for  West  Point  cadet  academic,  military,  and  physical  performance 
(Maddi,  Matthews,  Kelly,  White,  &  Villarreal,  2012).  Maddi  et  al.  defined  WCS  in 
the  following  way: 

[WCS  is]  a  weighted  composite  score  that  measures  high  school 
academic  performance  (e.g.,  Grade  Point  Average  (GPA),  high 
school  rank,  and  SAT  scores),  leadership  potential  (involvement  in 
leadership  roles  within  extracurricular  activities — that  is,  school 
officers,  scouting,  debate,  and  faculty  appraisals)  and  physical 
fitness (performance on standardized physical exercises). (Maddi
et al., 2012)

2 Duckworth et al. (2007) define grit as “perseverance and passion for long-term goals... grit
entails working strenuously toward challenges, maintaining effort and interest over years despite
failure, adversity, and plateaus in progress.”

A dichotomous variable (1 = retained beyond year one, 0 = separated
within year one) characterized retention. The cadet performance score (CPS)
measured first year performance. Similar to WCS, the CPS is a weighted
composite measure of performance across three USMA developmental
programs: academic, military, and physical (United States Corps of Cadets [USCC],
2012). Regression analyses revealed the following results:

•  WCS,  hardiness,  and  grit  predict  first  year  (cadet)  retention;  yet,  grit 
proved  the  most  important  predictor.  Retained  cadets  possessed 
higher  grit  scores. 

• WCS, grit, and hardiness are associated with CPS; yet, only WCS and
hardiness scores uniquely predict CPS scores. Moreover,
hardiness predicted unique variability in CPS after controlling for WCS
scores.

Maddi et al. (2012) broke from established methods (e.g., FFM, Mental
Ability)  of  measuring  performance  when  they  explored  hardiness;  however,  a 
self-assessed  hardy  person  may  be  a  poor  and  lazy  leader,  one  who  neither 
ventures  into  leadership  roles  nor  seeks  challenges  to  overcome.  Similarly,  an 
individual  strong  in  academics  who  is  high  in  hardiness-challenge  may  perform 
worse  in  (military)  leadership  tasks  (Bartone,  Kelly,  &  Matthews,  2013)  because 
he  takes  unnecessary  risks.  Therefore,  like  any  personality  measure,  haphazard 
application  of  the  hardiness  score  may  lead  to  erroneous  results. 

Bartone et al. (2013) evaluated whether psychological hardiness at entry
to  West  Point  predicts  leader  performance  and  adaptability  over  time.  The 
authors  defined  hardiness  (psychological)  as  a  “constellation  of  personality 
qualities  found  to  characterize  people  who  remain  healthy  and  continue  to 
perform  well  under  a  range  of  stressful  conditions”.  New  to  this  discussion  is  the 
concept  of  adaptability. 

USMA  defines  adaptability  as  “a  cadet’s  ability  to  anticipate  and  respond 
effectively  to  the  demands  of  multiple  competing  responsibilities”  (USMA,  2009). 
Again, USMA’s leader development goal is to produce officers who are agile,
adaptable and intellectually developed, possessing knowledge of ‘how to think’
rather than simply ‘what to think’ (USMA, 2009). Bartone et al. (2013) seem to
agree that adaptability is “effective change or adjustment in response to changing
conditions.” The authors (Bartone et al., 2013) expressed support for an
adaptability scale developed by Pulakos, Arad, Donovan and Plamondon (2000).
The adaptability scale measures eight dimensions of adaptability performance;
Bartone et al. (2013) created a ten-item survey consistent with that of Pulakos et
al. (2000).

The  adaptability  study  also  used  scholastic  aptitude  test  (SAT)  scores  and 
WCS  as  potential  predictors  of  performance.  Specifically,  criterion  variables 
were  cumulative  military  performance  score  (MPS),  self-rated  adaptability,  and 
supervisor-rated  adaptability.  MPS,  captured  during  the  senior  year,  provided  an 
index  of  traditional  military  performance.  Adaptability  measures,  on  the  other 
hand,  were  collected  three  years  after  graduation,  using  the  adaptability  scale 
(self-rated)  and  a  survey  (supervisor-rated)  administered  by  West  Point’s 
institutional assessment committee (IAC). The IAC is USMA’s feedback
mechanism: three years after each class graduates, it validates the achievement
of USMA’s educational program goals through a survey or interview of the
graduates’ superior or commanding officers (USMA, 2009). Findings from
Bartone et al. (2013) are as follows:

• SAT and hardiness-challenge are negative predictors of leader
performance.

•  The  pattern  suggests  that  the  more  intelligent  (SAT  score)  and 
adventurous  (hardiness-challenge)  cadets  do  not  perform  as  well 
as  the  less  intelligent  in  the  conventional  military  and  leadership 
tasks  in  the  West  Point  environment. 

•  SAT  scores  do  not  relate  to  MPS  or  Adaptability. 

•  WCS  predicts  USMA  leader  performance,  but  not  adaptability  post- 
USMA. 

•  The  stable  and  highly  regulated  environment  of  West  Point  stands 
in  contrast  to  the  uncertain  real-world  operational  environment. 

•  Hardiness  (commitment,  control)  predicts  leader  performance  at 
USMA,  self-rated  adaptability  and  supervisor-rated  adaptability 
after  graduation. 

• Psychological hardiness (commitment and control facets)
measured in academy freshmen predicts leader adaptability in
officers for seven years or more after graduation.

• Hardiness-commitment correlated with USMA military performance,
and  with  later  self-ratings  of  adaptability,  but  not  with  commander 
ratings. 

•  Hardiness-control  showed  a  significant  correlation  with  military 
performance  at  West  Point,  and  correlates  with  self  and 
commander  ratings  of  adaptability. 

Lastly, Bartone et al. (2013) concluded “hardiness... and the facets of
commitment, control and challenge appear to be distinct from FFM personality
dimensions... it is conceivable that some important personality characteristics are
not captured by the FFM...” However, the authors also acknowledged that the
hardiness  facets  may  operate  somewhat  independently;  hence,  it  is  worth  looking 
at  the  facets  individually,  alongside  the  total  hardiness  score. 

Table 1 (located in section I.D) contains a summary of the literature
review.

4.  Personnel  Selection 

Cook (2009) stresses the importance of people in any organization. People
are so valuable that organizations spend large amounts of time and
money finding (e.g., advertising or recruiting) the right person for the job. Cook
asserts, “Employees vary greatly in value, so selection matters... selection uses a
range of tests to assess a range of attributes”. Nevertheless, before a selection
instrument is used, the organization’s leaders must agree on suitability criteria
for performance outcomes. Doing so will set a standard and, like a net, “catch”
qualified personnel. Remarkably, exploring the performance outcome criteria
generates questions about the real nature of work or the true purpose of the
organization (Cook, 2009).

Cook  developed  a  personnel  selection  model  for  a  British  university.  We 
briefly  considered  this  model  before  we  looked  at  USMA’s  personnel  selection 
system.  Figure  1  shows  a  modified  version  of  Cook’s  model. 


Figure 1. Personnel Selection Model (After Cook, 2009)


In  order  to  spare  the  reader  an  extensive  explanation  of  Cook’s  model,  we 
insert  notes  of  clarification  to  assist  in  understanding  USMA’s  selection  model. 
a. Applicant

Before  an  applicant  begins  the  application  process,  he  must  meet 
all  basic  requirements  (e.g.,  minimum  educational  experience,  citizenship). 

b.  Application 

The application stage is a screening process. The self-report
application elements (mental ability, training and experience ratings, background,
personality, etc.) are converted into quantified measures using weighted application
blanks (WAB). WABs are useful as predictors of performance when employed
correctly (Cook, 2009).

c.  Evidence  of  Abilities 

The  evidence  stage  searches  for  concrete  proof  of  an  applicant’s 
true  knowledge,  skills  and  abilities.  Demonstrated  ability,  vice  self-reported 
ability,  manifests  itself  through  testing,  simulations,  or  exercises.  The  test  or 
exercise  helps  develop  an  index  of  the  applicant’s  work  performance.  However, 
the  organization  must  agree  on  the  definition  of  “successful”  performance.  Cook 
(2009)  identified  several  questions  about  the  true  nature  of  work  and  purpose  of 
the  organization  while  exploring  performance: 

• Is success measured best by counting objects produced or by
subjective opinion?

•  Who  decides  whether  work  is  successful? 

• Do the organization and its customers agree?

The  researcher  adds: 

•  Is  the  selection  instrument  correlated  with  job  tasks? 

The  remaining  stages  of  the  modified  model  (References, 
Questioning/Interview)  validate  the  written  application  through  personal 
interaction.  Both  stages,  when  credible  and  convincing,  will  increase  the  strength 
of  an  application. 
5. Use of Mental Ability Testing in Personnel Selection

Mental Ability (MA) accounts for an important portion of individual
performance, and the validity of MA tests increases with job complexity (Cook,
2009). The universal assumption here is that complex jobs require high mental
faculties. However, USMA does not expect its cadets to know every answer to
every single problem; rather, USMA expects its leaders to be tough-minded
thinkers capable of knowing where to find elusive answers. Suffice it to say, MA
is not a panacea. Recall the recent studies showing intelligence as not indicative
of Adaptability or Summer Field Training performance (Bartone et al., 2009;
Bartone et al., 2013). Cook (2009) mentioned a feasible use for MA testing: MA
tests predict training success well on timed, multiple-choice exams. Conversely,
MA tests are not good for predicting personality or for measuring leadership or
work quality.

MA testing has become unpopular and controversial due to gender and racial
bias and varying validity. In fact, many American employers abandoned MA
testing  after  passage  of  the  1964  Civil  Rights  Act,  yet  many  adverse  impacts  still 
have  not  been  resolved  (Cook,  2009).  The  job  description  of  the  U.S.  Army 
officer  provides  evidence  as  to  why  USMA  will  not  use  MA  as  a  sole  selection 
tool.  U.S.  Army  officers  are  multi-dimensional  and  expected  to  possess  more 
than  intellectual  prowess.  The  U.S.  Army’s  “Project  A”  found  MA  somewhat 
positively  related  to  work  performance.  Project  A  showed  the  following 
correlations3 between work performance and three “motivational aspects of
work”: effort and leadership (r = .31), personal discipline (r = .16), and physical
fitness and military bearing (r = .20) (Cook, 2009; McHenry, Hough, Toquam,
Hanson, & Ashworth, 1990). Cook (2009) continued, “personality tests, structured interviews,
and  work  samples  are  worth  being  used  alongside  MA  to  offer  incremental 
validity.” 


3 We mention correlation several times in this thesis and represent it with the letter “r.”
USMA followed a similar approach by using a myriad of assessments
alongside MA. For instance, although personality testing does not occur prior to
selection, USMA can no doubt identify a candidate’s personality framework from
information (e.g., biographical, social) collected prior to admission. An
individual’s demographics, membership in various organizations, letters of
recommendation and educational experience, among others, play an important
role in helping USMA identify future leaders of the Nation. In the next section, we
discuss USMA’s personnel selection model and admissions process.

6.  Personnel  Selection  at  USMA 

USMA’s  objective  and  quantitative  selection  process  essentially  reduces 
applicants  to  a  series  of  numbers.  As  impersonal  as  it  sounds,  a  quantitative 
process  is  warranted  due  to  the  large  number  of  applicants  and,  more 
importantly, because of predictive capability. Many studies (Dawes, 1971, 1974,
1977, 1979) document quantitative means (e.g., regression equations) as
superior  to  subjective  assessments  (e.g.,  admissions  committee  member 
ratings).  USMA  may  inspire  over  15,000  high  school  juniors  to  open  a  candidate 
file  (USMA,  2013c).  This  every-candidate-a-number  concept  allows  for  objective 
measurement  against  USMA  standards  and  comparison  among  individuals. 
Generally,  candidates  with  higher  numbers  are  more  qualified;  however,  this  is 
not always the case. To illustrate this point, a member of the USMA admissions
department  wrote,  “The  subjectivity  only  comes  in  if  we  request  WCS  bonuses4 
for  a  candidate's  file  or  during  the  nomination  process  (congressional  districts 
can  use  the  principal  nomination  process  that  allows  them  to  choose  their 
vacancy  winner  regardless  of  their  WCS)”  (Unger,  2013).  Of  the  15,000 
candidates,  over  4,000  received  congressional  nominations.  Yet,  USMA  reduced 
their  pool  of  interested  applicants  to  approximately  1200,  less  than  or  equal  to 


4  In  the  event  that  a  particular  candidate’s  WCS  insufficiently  captures  his  true  potential, 
USMA  admissions  may  award  WCS  bonus  points. 
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the  number  of  appointments  allowed  by  congress.  A  nominated  candidate  must 
patiently  wait  for  acceptance  status  while  his  entire  file  undergoes  meticulous 
scrutiny. 

This  thesis  centers  on  the  whole  candidate  concept  and  scrutinizes  past, 
present,  and  potential  performance  in  USMA’s  top  three  areas:  academic  ability 
(60  percent),  leadership  potential  (30  percent),  and  overall  fitness  (10  percent) 
(USMA,  2013c).  Previous  research  revealed  the  connection  between  the  three 
areas,  performance  and  retention  (graduation  versus  separation).  Therefore, 
USMA  maintained  what  they  call  “risk  levels  and  required  checks,”  a  list  of  cut-off 
scores that must be met for each candidate for each category. For example, an
SAT-Verbal score < 560, CEER < 520, or WCS < 5200 alerts the admissions
department to the potential risk of admitting a candidate. On the other hand, a
CEER > 650 or a community leadership score (CLS) > 650 identifies a candidate
as a scholar or leader, respectively (USMA, circa 1996). An explanation of the
tool and equations West Point used to quantify candidates is located in Appendix
B.
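
The cut-off logic described above can be sketched in a few lines of Python. This is only an illustrative sketch: the thresholds are the ones quoted in the text, while the `Candidate` record, its field names, and the `screen` function are hypothetical names introduced here and do not represent USMA's actual admissions tooling.

```python
from dataclasses import dataclass

# Thresholds quoted in the text (USMA, circa 1996); a score below any of
# these alerts admissions to potential risk.
RISK_CUTOFFS = {"sat_verbal": 560, "ceer": 520, "wcs": 5200}
SCHOLAR_CEER = 650  # CEER above this value marks a scholar
LEADER_CLS = 650    # CLS above this value marks a leader


@dataclass
class Candidate:
    """Hypothetical candidate record; field names are illustrative only."""
    sat_verbal: int
    ceer: int
    wcs: int
    cls: int


def screen(candidate: Candidate) -> dict:
    """Return risk flags and scholar/leader designations for one candidate."""
    risk_flags = [
        name for name, cutoff in RISK_CUTOFFS.items()
        if getattr(candidate, name) < cutoff
    ]
    return {
        "risk_flags": risk_flags,
        "scholar": candidate.ceer > SCHOLAR_CEER,
        "leader": candidate.cls > LEADER_CLS,
    }


# Made-up candidate: low SAT-Verbal, strong community leadership score.
example = screen(Candidate(sat_verbal=550, ceer=600, wcs=5300, cls=700))
print(example)
```

A candidate can carry a risk flag and a leader or scholar designation at the same time, which is consistent with the whole candidate concept of weighing several areas together.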


7.  Performance  Measurement  at  USMA 

A  key  to  the  success  of  any  organization  is  holistic  adherence  to  its 
standard  operating  procedures  (SOP).  The  SOP  serves  as  a  guiding  light  to 
employees,  often  showing  them  how  or  when  their  mission  is  complete.  The 
USMA  SOP  documents  command  and  administration  topics  for  the  United  States 
Corps  of  Cadets  (USCC),  to  include  obligations,  definitions,  standards, 
authorizations  and  privileges.  It  covers  topics  such  as  military  courtesy,  uniform 
wear and appearance, behavioral conduct, accountability, academic policies,
etc. (United States Corps of Cadets [USCC], 2012). Before graduation, a
cadet must meet the many expectations outlined in USCC’s SOP.

In  particular,  and  critical  to  the  topic  of  performance  measurement,  a  cadet 
must  fulfill  the  academic,  military,  and  physical  program  standards  prior  to 
graduation. Subordinate to the USCC SOP are the programs’ guidebooks,
known as the Redbook (academic program), Greenbook (military program), and
Whitebook (physical program). Each program yields an associated program
score at the completion of training: academic program score (APS), military
program score (MPS), and physical program score (PPS). These scores allow
comparisons between cadets and, when combined, create the CPS. USMA
assigned the following weights to the CPS elements:

CPS = .55(APS) + .30(MPS) + .15(PPS) (Special Assistant to the
Commandant for Strategic Planning [Planning], 2010).
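
As a worked illustration of the CPS weighting (0.55, 0.30, 0.15), the short Python sketch below computes the composite for one cadet. The function name and the input scores are hypothetical; the scores are made up and assume the grade-point-style scale implied by the 2.00 graduation minimums discussed in this section.

```python
import math

# CPS weights as stated in the text (Planning, 2010).
CPS_WEIGHTS = {"APS": 0.55, "MPS": 0.30, "PPS": 0.15}


def cadet_performance_score(scores: dict) -> float:
    """Weighted composite of the three program scores."""
    # Guard against an edited weight table that no longer sums to one.
    if not math.isclose(sum(CPS_WEIGHTS.values()), 1.0):
        raise ValueError("CPS weights must sum to 1")
    missing = set(CPS_WEIGHTS) - set(scores)
    if missing:
        raise KeyError(f"missing program scores: {sorted(missing)}")
    return sum(weight * scores[name] for name, weight in CPS_WEIGHTS.items())


# Made-up scores: 0.55(3.0) + 0.30(3.4) + 0.15(2.8) = 1.65 + 1.02 + 0.42
cps = cadet_performance_score({"APS": 3.0, "MPS": 3.4, "PPS": 2.8})
print(round(cps, 2))  # 3.09
```

Because academics carry more than half of the weight, a one-point change in APS moves the CPS more than a one-point change in MPS and PPS combined.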

Earlier, we saw similar weights assigned to academic, leadership, and
fitness domains of the Whole Candidate Concept. For context, we provide
several definitions.


a.  Academic  Program  Score 

Performance  in  courses  within  the  academic  program  comprises 
the  APS.  It  does  not  include  military  science  and  physical  education  courses 
(Office  of  the  Dean,  2010).  A  cumulative  APS  (APSC)  of  2.00  or  higher  is 
required  for  graduation. 

b.  Military  Program  Score  (MPS) 

The  MPS  is  the  composite  score  reflected  in  the  accumulated  cadet 
performance  in  required  military  development  and  core  military  science  courses. 
The  eleven  required  MD  courses  are  76  percent  of  the  MPS,  while  the  eight  core 
MS  courses  comprise  the  remaining  24  percent.  Annex  A  of  the  Greenbook  lists 
summer  training,  military  duty  performance  during  each  term  and  military  science 
courses during the academic year as elements of the MPS. The following
formula conveys how the MPS is calculated: MPS = .70(MD) + .30(MS)
(Planning, 2010).
c. Military Development (MD)


The  MD  grade  is  a  subjective  evaluation  of  cadet  leader 
performance,  during  the  four-year  West  Point  experience  (USCC,  2012). 

d.  Military  Science  (MS) 

MS  consists  of  three  military  science  core  courses  designed  to 
enhance  the  professional  military  education  of  cadets  and  develop  the 
foundational  military  skills  and  troop-leading  procedures  required  of  junior 
officers.  Warfighters’  lecture  series,  joint  professional  military  education,  tactical 
decision-making  exercises,  combat  simulations  and  faculty  experience 
complement  and  reinforce  military  science  (core)  courses.  Selected  basic  officer 
leadership  course  (BOLC-A)  tasks  are  also  trained  and  evaluated  in  MS  courses 
(Planning,  2010). 

USMA publications confirmed MPS weights are progressive:
activities completed at higher levels of responsibility generally have greater
weight. A cadet must achieve a cumulative MPS (MPSC) of 2.00 or higher by
the end of third-class (junior) year and maintain it through the conclusion of
first-class year.


e. Physical Program Score (PPS)

The  PPS  is  an  annually  reported  score  documenting  a  cadet’s 
performance  during  instructional  coursework,  physical  fitness  testing  and 
competitive  sport  participation  (Office  of  the  Commandant  of  Cadets  [Whitebook], 
2011). 


f. Instructional Coursework (IC):

IC is composed of two areas. The first area comprises the
basic activities relevant to military duties (e.g., combatives, boxing, military
movement, survival swimming, personal fitness and development). The second
area promotes physical development in a wide variety of activities (e.g., rock
climbing, tennis, alpine skiing, cycling and SCUBA) (Whitebook, 2011).

g.  Fitness  Testing  (FT): 

FT  includes  three  areas.  First,  each  cadet  must  develop  and 
implement  a  personal  fitness  program  while  at  the  Academy.  Second,  cadets 
must  participate  in  and  pass  the  Army  physical  fitness  test  each  semester.  Lastly, 
cadets  must  pass  the  indoor  obstacle  course  test  during  their  junior  year 
(Whitebook,  2011). 

h.  Competitive  Sports: 

Competitive  sports  participation  is  vital  to  a  cadet’s  development 
and  is  a  “precursor  to  success”  (Whitebook,  2011).  General  Alexander  Haig 
noticed,  “Sports  provided  the  only  peacetime  activity  where  the  stressors 
simulated  those  on  a  battlefield.”  General  Omar  Bradley  valued  the  resultant 
group  cooperation.  General  Douglas  MacArthur,  as  USMA  Superintendent 
following  World  War  I,  required  every  cadet  to  participate  in  organized  athletics 
because  he  believed  athletes  made  the  best  Soldiers  (Whitebook,  2011).  In  fact, 
every  cadet  is  required  to  memorize  General  MacArthur’s  famous  quote,  “Upon 
the  fields  of  friendly  strife  are  sown  the  seeds  that  upon  other  fields,  on  other 
days,  will  bear  the  fruits  of  victory.” 

To  evaluate  performance,  USMA  used  a  subjective  rating  called  the 
character  in  sports  index  (CSI).  CSI  measures  the  following  characteristics: 
sportsmanship;  mental  toughness,  perseverance,  winning  spirit;  unselfishness; 
coachability,  attitude,  teachable  spirit;  playing  ability;  and  time  (Whitebook, 
2011).  A  cumulative  physical  program  score  (PPSC)  of  2.0  is  required  to 
graduate.  The  following  formula  illustrates  how  the  PPS  is  calculated: 

PPS = .50(IC) + .30(FT) + .20(CSI).
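
The sketch below, in Python, applies the PPS weighting to instructional coursework, fitness testing, and the character in sports index, then compares the result with the 2.0 minimum. It is a simplification under stated assumptions: the function names and input values are hypothetical, and the check is applied to a single computed score, whereas the text states the requirement for the cumulative score (PPSC).

```python
# PPS weights as stated in the text (Whitebook, 2011).
PPS_WEIGHTS = {"ic": 0.50, "ft": 0.30, "csi": 0.20}
GRADUATION_MINIMUM = 2.0  # applied here to a single score for illustration


def physical_program_score(ic: float, ft: float, csi: float) -> float:
    """Weighted sum of instructional coursework, fitness testing, and CSI."""
    return (PPS_WEIGHTS["ic"] * ic
            + PPS_WEIGHTS["ft"] * ft
            + PPS_WEIGHTS["csi"] * csi)


def meets_minimum(score: float) -> bool:
    """True when the score reaches the 2.0 threshold."""
    return score >= GRADUATION_MINIMUM


# Made-up cadets: 0.50(3.0) + 0.30(2.5) + 0.20(3.5) = 2.95
strong = physical_program_score(ic=3.0, ft=2.5, csi=3.5)
# 0.50(1.8) + 0.30(2.0) + 0.20(2.2) = 1.94
weak = physical_program_score(ic=1.8, ft=2.0, csi=2.2)
print(round(strong, 2), meets_minimum(strong))
print(round(weak, 2), meets_minimum(weak))
```

In the second case, a passing fitness-testing grade does not offset weak coursework, because coursework carries half of the weight.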
8. Retention


In 1987, then-USMA superintendent Lieutenant General (LTG) Dave
Palmer revised the mission statement and, for the first time, developed a purpose
statement. The mission statement added “...inspire each (graduate) to a lifetime
of service to the nation”; the purpose became “provide the nation with leaders of
character who serve the common defense” (Stanton, 1995). The revised
statements clarified “what” and “why” for West Point. Interested personnel (e.g.,
Congress, cadet candidates) then understood the short-term (uniformed service
in the regular Army) and long-term (service to the nation) responsibilities for
graduates (Stanton, 1995).

In  Preparing  for  West  Point’s  Third  Century,  A  Summary  of  the  Years  of 
Affirmation  and  Change,  1986-1991,  Larry  R.  Donnithorne  writes,  “West  Point 
graduates  will  advance  in  the  Army  as  far  as  their  talents  and  the  needs  of  the 
service  take  them.  Their  dedication  to  selfless  service,  even  beyond  the  time  in 
uniform  is  both  a  national  need  and  a  historical  expectation.  They  are  to  be 
leaders  for  a  lifetime”  (Stanton,  1995).  LTG  Palmer’s  decision  later  helped  build 
USMA’s  relevance  in  the  wake  of  Army  downsizing  and  eventually  attracted 
those  committed  to  serving  national  needs,  during  and  after  uniformed  service. 

Officers  are  required  to  serve  an  active  duty  service  obligation  (ADSO) 
upon  receiving  their  commission.  The  service  obligation  is  five  years  for  USMA 
graduates  (six  years  for  aviators),  four  years  for  reserve  officer  training  corps 
(ROTC)  scholarship  recipients,  and  three  years  for  others  (non-scholarship 
ROTC  graduates,  officer  candidate  school  graduates,  and  direct  appointees) 
(Department  of  the  Army  Headquarters,  2007).  At  the  end  of  the  service 
contract,  officers  have  the  option  to  terminate  their  service  or  remain  on  active 
duty. 
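
The service obligations listed above amount to a simple lookup by commissioning source. The Python sketch below is illustrative only: the dictionary keys and the function name are hypothetical labels, the aviator entry assumes the six-year figure applies to USMA graduates who become aviators, and any additional obligations an officer may incur after commissioning are ignored.

```python
# ADSO lengths in years, as listed in the text
# (Department of the Army Headquarters, 2007). Keys are hypothetical labels.
ADSO_YEARS = {
    "USMA": 5,
    "USMA_AVIATOR": 6,      # assumption: six-year figure applies to USMA aviators
    "ROTC_SCHOLARSHIP": 4,
    "OTHER": 3,             # non-scholarship ROTC, OCS, direct appointees
}


def first_decision_year(commission_year: int, source: str) -> int:
    """Year in which the initial obligation ends and the officer may leave."""
    return commission_year + ADSO_YEARS[source]


# Made-up example: a USMA graduate commissioned in 2008.
decision_year = first_decision_year(2008, "USMA")
print(decision_year)  # 2013
```

The continuation rates discussed in this section are measured at six years of service, one year past this decision point for most USMA graduates.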

Circumstances  influencing  a  decision  to  stay  or  leave  the  Army  ranks  are 
different  for  each  officer.  Between  1950  and  1981,  the  continuation  rate  of 
USMA  graduates  who  chose  to  stay  in  the  Army  beyond  six  years  decreased 
from 77 percent to 70 percent. A downward trend persisted until the class of
1989, when the continuation rate rose from around 45 percent to 48 percent
(Stanton, 1995). Although LTG Palmer’s mission statement became relevant for
a period, tolerating the departure of graduates immediately after their ADSO for
(government) civilian work, it failed to communicate the nation’s true hope for
West Pointers.

The  current  USMA  mission  statement  specifies  a  “...career  of  professional 
excellence  and  service  to  the  Nation  as  an  officer  in  the  United  States  Army.” 
The  contrast  between  the  two  mission  statements  is  that  the  old  mission 
statement  associates  career  with  both  uniformed  and  civilian  service,  whereas 
the  current  mission  statement  specifies  career  in  the  capacity  of  a  U.S.  Army 
officer. This distinction might explain why some USMA officers, over time, stay in
the Army beyond their commitment and why many depart at their first
opportunity. As military officers, graduates under the current mission statement
possess a sense of commitment and readiness to serve their country, fully
knowing USMA expects them to serve as U.S. Army officers. On the other
hand, officers under the previous, more general service-to-nation mission
statement might consider the possibility of civilian service at some point.

Retention  is  an  important  issue  for  any  organization  that  wants  to  maintain 
an  educated,  highly  specialized  and  all-volunteer  force.  For  the  U.S.  Army, 
factors  that  affect  the  turnover  of  Army  officers  are  subjective  and  difficult  to 
measure  accurately.  Yet,  the  reasons  an  officer  stays  or  leaves  seem 
comparable  to  those  in  the  civilian  population.  In  an  NPS  thesis,  Gene  (2008) 
hoped  to  identify  reasons  for  separation  of  USMA  graduates  by  investigating 
deployment length and deployment frequency during the global war on terrorism
(GWOT) for year groups 1994 through 2001.

Gene’s research revealed the main factors that affected retention were
economics (better job options, higher earnings), better (or stable) living locations,
satisfaction with military life, harmony of dependents with military lifestyle, and
psychological reasons (Gene, 2008). Interestingly, Gene found officers who
deployed to non-hostile environments for a period of more than fifteen months
were 23 percent more likely to leave (the Army) than their pre-GWOT peers
(Gene, 2008). This finding is counterintuitive. Gene observed that non-hostile
deployments negatively affect retention and cited the following:

Existing  research  suggests  ‘(hostile)  deployments  increase  the  job 
satisfaction  and  resulted  with  a  higher  retention... results  supported 
previous  studies  that  found  deployment  had  a  positive  effect  on  job 
satisfaction  and  increased  the  level  of  personal  fulfillment,  thus, 
lead  to  the  decision  to  stay  more  on  the  military’.  (Gene,  2008) 

Britt,  Adler,  &  Bartone  (2001)  wrote  about  the  relationship  between  job 
satisfaction  and  hardiness.  They  found  hardiness  “associated  with  being 
engaged  in  meaningful  work  during  the  deployment,  which  was  strongly 
associated  with  deriving  benefits  from  the  deployment  months  after  it  was  over.” 
Similarly, Duckworth et al. (2007) examined another personality factor called grit
as an indicator of retention for two USMA classes, 2004 and 2006. The studies
showed grit as a predictor of retention “above and beyond” WCS and FFM
Conscientiousness (Duckworth et al., 2007). Commenting on the Duckworth et al.
(2007) study, Maddi et al. (2012) noted, “Cadets who were retained were twice
as  likely  to  have  higher  grit  scores  as  compared  with  cadets  who  were 
separated....”  Nonetheless,  in  training  or  deployed  environments,  officers  high  in 
grit  may  have  personal  or  familial  challenges  to  overcome  before  deciding  to 
make  the  military  a  career. 

A Naval Postgraduate School (NPS) thesis by Gjurich (1999) expressed
similar thoughts. Gjurich cited a 1989 document that attributed officers' reasons
to  leave  military  service  to  “dissatisfaction  with  the  military  lifestyle,  civilian  career 
opportunities  and  security,  and  family  status”  (Gjurich,  1999).  Additionally, 
Gjurich’s  analysis  uncovered  variables  that  increased  retention,  namely, 
commissions  from  ROTC  programs  and  various  levels  of  postgraduate  education 
(Gjurich, 1999). We infer from his results that Academy (e.g., Naval Academy or
USMA)  graduates  tend  to  leave  the  service  at  a  higher  rate  than  ROTC  officers 
do. 

Gjurich’s  thesis  generated  a  viable  timeline  for  a  military  officer’s  career 
and  hypothesized  an  officer’s  first  five  years  as  a  period  of  mutual  study  and 
discovery  between  him  and  his  organization.  Between  the  fifth  and  tenth  year, 
the  officer  discovers  where  he  belongs  in  the  organization  before  he 
subsequently  decides  to  remain  or  depart  his  chosen  profession. 

The  aforementioned  research  admits  the  difficulty  of  any  model  to  predict 
retention  for  a  specific  individual  (Gjurich,  1999)  and  it  is  difficult  to  verify,  in  a 
timely  manner,  which  circumstances  motivated  an  officer’s  departure  (Gene, 
2008).  Moreover,  USMA’s  mission  statement,  perhaps  purposely,  fails  to  define 
the  duration  of  a  “...career  of  professional  excellence  and  service  to  the  Nation 
as  an  officer  in  the  United  States  Army.”  Perhaps  there  is  a  personality  factor 
positively  related  to  retention  that  we  could  explore.  Could  hardiness  be  that 
factor? 

D.  SUMMARY  OF  LITERATURE  REVIEW 

Table 1 contains a summary of the research documented in the literature
review. Prior research findings reveal that (1) CEER, hardiness, and WCS predict
leader performance (MD grades, MPS), while mental ability (SAT score) inversely
predicts MPS; (2) hardiness and WCS predict CPS; and (3) grit and
hardiness predict retention. Variables that previous research recognized as
successful predictors belong to the following categories:

•  Leadership Performance
    •  Outcome: MD grades, MPS
    •  Predictors (+): CEER, hardiness, WCS

•  Cadet Cumulative Performance
    •  Outcome: CPS
    •  Predictors (+): hardiness, WCS

•  Retention
    •  Outcome: Years retained
    •  Predictors (+): Grit, hardiness


Author | Study Outcome(s) | Study Predictor(s) | Results | Predictor Correlates

Clark (2007) | Academic Success (GPA) | Big Five | Academic Success pos. rel. to Conscientiousness | Morningness pos. rel. to Conscienc.
             |                        | Sleep (Morningness-Eveningness) | Not rel. to Academic Success |
             |                        | Sleep Quality (Actigraphy Watches) | Not rel. to Academic Success |

Bartone (2002) | Leader Performance (Upperclass MD grades) | Spatial Judgment battery; Logical reasoning battery; [illegible] battery; Problem solving battery; Big Five (NEO-Analog) | No Cognitive or Personality Predictors rel. to Leader Performance |
               |                                           | CEER | Leader Performance pos. rel. to CEER |

Bartone (2009) | Leader Performance (Summer, Academic YR, Combined Upperclass MD grades) | Psychological Hardiness | Summer Leader Performance pos. rel. to hardiness; Academic Leader Performance pos. rel. to hardiness | No correlation with Gender, CEER
               |  | Big Five | Summer Leader Performance pos. rel. to Extraversion; Academic Leader Performance pos. rel. to Conscientiousness |
               |  | Social Judgement | Summer & Academic Performance slightly rel. to Soc. Judgment |
               |  | Gender | Not rel. to Summer Leadership Performance | Conscienc., Agreeableness
               |  | CEER | Not rel. to Summer Leadership Performance; Academic Leader Performance pos. rel. to CEER | Negatively rel. to Neuroticism

Maddi (2012) | Cadet Performance Score & 1st-Year Retention (Freshmen) | WCS | Retention minimally pos. rel. to WCS; CPS pos. rel. to WCS | Correl. with Hardiness
             |  | Hardiness | Retention moderately pos. rel. to Hardiness; CPS extremely pos. rel. to Hardiness |
             |  | Grit | Retention extremely pos. rel. to Grit; CPS pos. rel. to Grit | WCS; Correl. with Hardiness

Bartone (2013) | Military Performance Score (Senior YR) & Adaptability (Self-rate, Supervisor-rate 3 YR post-USMA) | Psychological Hardiness | MPS pos. rel. to Hardiness-Control; MPS pos. rel. to Hardiness-Commitment; MPS negatively rel. to Hardiness-Challenge; Adaptability (Self) pos. rel. to Hardiness-Control; Adaptability (Supervisor) pos. rel. to Hardiness-Control; Adaptability (Supervisor) not rel. to Hardiness-Commitment, Challenge |
               |  | SAT Score | MPS negatively rel. to SAT |
               |  | WCS | MPS pos. rel. to WCS |

Table 1.  Literature Review Summary
METHODS  USED


A.  DATA  DESCRIPTION 

The  current  study  used  historic  data  obtained  from  OIR  for  cadet  year 
groups  2005,  2006  and  2007.  At  the  researcher’s  request,  and  in  concert  with 
previous  literature,  OIR  originally  released  42  variables.  OIR  did  not  release  data 
pertaining  to  the  FFM,  sleep  quality,  Morningness-Eveningness,  or  cognition 
(spatial  judgment,  logical  reasoning,  social  judgment,  and  problem  solving). 

The  researcher  separated  the  data  into  three  groups.  The  first  group 
consists  of  pre-admission  variables,  used  to  develop  a  hardiness  predictor;  the 
second  group  includes  post-admission  predictors  (including  hardiness)  for 
retention.  All  (remaining)  variables  identified  as  non-predictors  comprise  the  third 
group. 

We  discarded  10  of  the  original  variables,  bringing  the  useable  variable 
count  to  32.  In  particular,  we  explain  two  identifier-variables,  personal 
identification  number  (PIN)  and  “class  admitted  to.”  Each  observation  had  a  PIN 
for the protection of human subjects. “Class admitted to” simply communicated
to which cohort each PIN belonged. These two variables helped align data,
ensuring  consistency,  throughout  the  data  compilation  process.  We  eventually 
discarded  the  other  eight  variables  because  of  either  redundancy  or  lack  of 
necessity.  However,  a  few  of  the  eight  variables  served  a  special  purpose. 

Specifically,  “graduation  date  from  USMA”  and  “years  of  service,”  verified 
information  gained  from  other  variables.  “Graduation  date”  and  “class  admitted 
to”  helped  identify  which  cadets  graduated  on  time,  (with  original  cohort).  We 
excluded  cadets  who  did  not  graduate  on  time  from  the  analysis.  The  reason  for 
this  was  to  guarantee  consistency  throughout  the  data  set.  Similarly,  “years  of 
service”  and  “active  duty  status”  validated  which  graduates  were  retained,  and  if 
so,  how  many  years  of  active  service  they  amassed.  The  researcher  decided 
that  the  dichotomous  criterion  variable  “active  duty  status”  sufficiently 
communicated retention, as well as, if not better than, the numeric variable
“years  of  service.”  See  Appendix  C  for  a  by-variable  breakdown  of  variable  type 
and  number  of  levels  (or  range,  where  applicable). 

B.  DEMOGRAPHICS  OF  THE  ARCHIVE 

There  were  3,716  records  in  the  data  set,  of  which  587  were  females  and 
3,129  were  males.  Table  2  shows  self-reported  demographics,  including  race, 
gender,  and  graduation  status. 


               Gender        Race
YG    Admits   F     M       A     AI    B     C      H     O     U     Separated   Graduated
2005  1198     193   1005    93    10    100   899    70    18    8     257         941
2006  1193     200   993     79    10    72    930    80    16    6     310         883
2007  1325     194   1131    95    11    64    1038   100   10    7     300         1025
TOT   3716     587   3129    267   31    236   2867   250   44    21    867         2849

A- Asian, AI- American Indian, B- Black, C- Caucasian, H- Hispanic, O- Other, U- Unknown


Table 2.  Demographics of the OIR Original Data

C.  MISSING  VALUES 

Using  descriptive  statistics,  we  inspected  the  records  for  uniformity  and 
consistency  and  accounted  for  problematic  entries  (e.g.,  missing  data,  outliers). 
The USMA experience spans 47 continuous months; yet a number of cadets
end their careers prematurely (e.g., separation from the academy) or finish late
(e.g., graduating in a different cohort due to academic failures, medical reasons,
etc.). For the hardiness data set, all observations (cadets) began and ended
their USMA careers together. This ensured that every cadet in the data set had
the same opportunities as their peers to affect their final academic, military, and
physical performance scores. An example showing why this is important
follows:


A  cadet  enters  USMA  with  YG  2005  as  a  Plebe.  Because  of 
academic  trouble,  USMA  holds  the  cadet  back  one  year,  forcing 
him  to  join  YG  2006.  The  cadet,  also  an  intercollegiate  athlete, 
receives  an  injury  a  year  later.  The  injury  results  in  training 
absences due to post-surgery recovery. However, the cadet
possesses  potential  for  leadership  and,  based  on  the 
recommendations  of  his  cadet  and  Army  officer  Chain  of 
Command,  USMA  retains  him.  However,  the  cadet  never  catches 
up  to  YG  2006  peers  in  terms  of  credit  hours.  Thus,  he  graduates 
(and  commissions)  late,  in  December,  rather  than  May. 

Categorical  variables  with  entries  coded  as  “unknown”  or  “other”  created 
additional  problems  with  the  analysis.  If  an  “unknown”  or  “other”  categorical  level 
proved  significant  at  the  completion  of  analysis,  we  faced  the  challenge  of 
explaining  it.  Fortunately,  the  problematic  cases  were  few,  remedied  by  deletion 
from  the  data  set. 

Table  3  identifies,  for  the  hardiness  data  set,  the  number  of  cadets  who 
had  missing  scores.  Similarly,  Tables  4  and  5  document  problematic  entries  for 
the  two  retention  data  sets  (graduation  versus  separation;  active  duty  versus 
loss).  Reasons  for  the  missing  values  remain  unknown. 


Hardiness Data Set

Original data set entries: YG 2005, 2006, 2007        3716
Entries missing hardiness scores                      -138
Entries with "unknown" political views                -155
Entries with "unknown" race                            -17
Entries with type of high school "other"               -25
Final data set entries                                3381

Table 3.  Problematic Entries: Hardiness Data Set

Graduation versus Separation Data Set

Original data set entries: YG 2005, 2006, 2007        3716
Graduates failing to graduate on time                 -149
Entries missing hardiness scores                      -138
Entries missing cadet performance data
  (APS, MPS, PPS, or CPS)                             -279
Final data set entries                                3150

Table 4.  Problematic Entries: Graduation versus Separation Data Set


Retention Data Set #2

Original data set entries: YG 2005, 2006, 2007        3716
Separated cadets                                      -864
Graduates failing to graduate on time                 -139
Entries missing hardiness scores                       -99
YG 2005, 2006 graduates branched aviation             -319
YG 2005, 2006 graduates branched med. corps            -17
YG 2005, 2006 graduates branched "other"               -21
YG 2007 (< 6 years in service)                        -834
Final data set entries                                1423

Table 5.  Problematic Entries: Active Duty versus Loss Data Set
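
The attrition arithmetic in Tables 3 through 5 can be checked with a short script. This is only an illustrative sketch: the counts are copied from the tables, while the dictionary layout and the function name are assumptions made for the example and are not part of the original analysis.

```python
# Counts copied from Tables 3, 4, and 5. The data structure is hypothetical.
ORIGINAL_ENTRIES = 3716

exclusions = {
    "hardiness": [138, 155, 17, 25],
    "graduation_vs_separation": [149, 138, 279],
    "active_duty_vs_loss": [864, 139, 99, 319, 17, 21, 834],
}


def final_entries(original, removed):
    """Subtract every exclusion count from the original number of records."""
    return original - sum(removed)


final_counts = {
    name: final_entries(ORIGINAL_ENTRIES, removed)
    for name, removed in exclusions.items()
}
print(final_counts)
```

Each result matches the "Final data set entries" row of the corresponding table.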


D.  ANALYSIS  PROCEDURE 

1.  Hardiness  and  Simple  Linear  Regression 

In  the  hardiness  data  set,  we  have  a  mixture  of  categorical,  continuous 
numeric  and  integer  variables  (see  Appendix  C).  Furthermore,  our  response  is 
continuous,  leading  us  to  use  linear  regression.  Linear  regression  is  a  useful 
way  to  conduct  regression  analysis,  accounting  for  both  categorical  and  numeric 
inputs. 

First, we identified variables susceptible to multicollinearity using
correlation tables and/or variance inflation factor (VIF) diagnostics. Second, we
used stepwise regression techniques to develop a hardiness predictor. Stepwise
regression systematically adds or drops predictors at each step depending on
which change reduces the Akaike information criterion (AIC) the most. AIC measures
relative model quality, although it does not guarantee goodness of fit. The “forward”
process of stepwise regression searches for an appropriate model between the null
(“main effects”) model and the full model, while the “backward” process works in the
opposite manner. In our data, both routes identified the same significant variables.
We developed various models (main effects and variable-interaction⁵) recommended by the
stepwise method. We used the statistical computing software “R,” which
contains a stepwise function with options for the forward and backward methods.
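
The forward direction of an AIC-driven stepwise search can be sketched as follows. This is a simplified illustration in Python, not the R routine used in the study. It assumes numeric predictors only, ordinary least squares with an intercept, and the Gaussian form AIC = n ln(RSS/n) + 2k (up to an additive constant). All function names and the toy data are hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(columns, y):
    """Residual sum of squares of a least-squares fit with an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(design[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(row[j] * beta[j] for j in range(p)) for row in design]
    return sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))


def aic(columns, y):
    """Gaussian AIC up to an additive constant: n * ln(RSS / n) + 2k."""
    n = len(y)
    k = len(columns) + 1  # slopes plus the intercept
    return n * math.log(rss(columns, y) / n) + 2 * k


def forward_stepwise(predictors, y):
    """Add the predictor that lowers AIC the most until no addition helps."""
    selected = []
    best = aic([], y)
    while True:
        scores = {
            name: aic([predictors[s] for s in selected + [name]], y)
            for name in predictors if name not in selected
        }
        if not scores:
            break
        name = min(scores, key=scores.get)
        if scores[name] >= best:
            break
        selected.append(name)
        best = scores[name]
    return selected, best


# Toy data: y depends on x1 only; x2 is an unrelated cyclic pattern.
x1 = [float(i) for i in range(20)]
x2 = [float((7 * i) % 5) for i in range(20)]
noise = [0.1 * ((3 * i) % 4 - 1.5) for i in range(20)]
y = [1.0 + 2.0 * a + e for a, e in zip(x1, noise)]

selected, best_aic = forward_stepwise({"x1": x1, "x2": x2}, y)
print(selected, round(best_aic, 2))
```

A backward search would start from the full model and drop the term whose removal lowers AIC the most. Categorical predictors would first need dummy coding, which this sketch omits.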

Third, we analyzed the difference in mean response for each predictor
level using the Kruskal-Wallis test statistic, analysis of variance (ANOVA) and
Tukey comparison test. Lastly, for models containing significant interaction terms
(p < .05), we created interaction plots. Interaction plots, typically
produced  for  two  interacting  categorical  variables  and  a  response  variable, 
showed  us  how  the  average  hardiness  of  one  variable  varied  as  the  other 
variable  changed. 
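
The quantity an interaction plot displays is the mean response within each combination of two categorical levels. A minimal sketch of that computation follows. The records, field names, and hardiness values are hypothetical and serve only to show the mechanics.

```python
from collections import defaultdict


def cell_means(rows, factor_a, factor_b, response):
    """Mean response for every observed (factor_a, factor_b) combination."""
    totals = defaultdict(lambda: [0.0, 0])
    for row in rows:
        key = (row[factor_a], row[factor_b])
        totals[key][0] += row[response]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Hypothetical records, not taken from the OIR data.
records = [
    {"gender": "Male", "f_degree": "HighSchool", "hardiness": 2.0},
    {"gender": "Male", "f_degree": "HighSchool", "hardiness": 1.5},
    {"gender": "Male", "f_degree": "College", "hardiness": 1.75},
    {"gender": "Female", "f_degree": "HighSchool", "hardiness": 1.5},
    {"gender": "Female", "f_degree": "College", "hardiness": 2.25},
]

means = cell_means(records, "gender", "f_degree", "hardiness")
for key in sorted(means):
    print(key, means[key])
```

Plotting one line per level of the first factor across the levels of the second gives the interaction plot. Non-parallel lines suggest an interaction.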

Additional  explanations  and  mathematical  formulae  are  located  in 
Appendix  E. 

2.  Retention  and  Logistic  Regression 

We used generalized linear models (GLM) to predict retention. Retention,
a binary outcome with binomially distributed errors, requires a function to assign
a number value to the two response values “YES” and “NO.” If we restrict the
range of that function to (0, 1), we find a useful way to obtain probabilities for
predicting our response. We associate probabilities close to zero with the response
“NO” and probabilities near one with “YES.” See Appendix F for additional details and
mathematical formulation.
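
As a brief illustration of the mapping described above, the sketch below assumes the standard logit link, whose inverse sends any linear predictor to a probability in (0, 1). The 0.5 cutoff and the function names are assumptions for the example. The formulation actually used in the study is in Appendix F.

```python
import math


def inverse_logit(eta):
    """Map a linear predictor (any real number) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def classify(eta, threshold=0.5):
    """Label a case "YES" when its predicted probability reaches the cutoff."""
    return "YES" if inverse_logit(eta) >= threshold else "NO"


for eta in (-2.0, 0.0, 2.0):
    print(eta, round(inverse_logit(eta), 3), classify(eta))
```

A complementary log-log link, also mentioned in this section, would replace `inverse_logit` with 1 - exp(-exp(eta)). The classification step stays the same.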

5  For example, main effects: Hardiness = gender + race + parents’ degree; variable-
interaction: Hardiness = (gender)(parents’ degree) + (race)(parents’ service) + log(APSC) + . . .

After fitting several of the GLMs (including stepwise and clog-log link), we
inspected the significance levels (p-values) of the variables. We then
constructed a generalized additive model (GAM) with smoothing functions
applied to the numeric predictors before visualizing the partial residual plots to
determine which transformation(s) to make, if any. After making the necessary
transformations, we fit additional GLMs and compared their performance by
using ANOVA. Next, we developed a confusion matrix to see how accurately our
model classified the response probabilities.
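
The confusion-matrix step can be sketched as a cross-tabulation of observed outcomes against thresholded predicted probabilities. The 0.5 cutoff, the function names, and the five example cases are hypothetical. They are not results from the study.

```python
from collections import Counter


def confusion_matrix(actual, predicted_prob, threshold=0.5):
    """Count (actual, predicted) label pairs after thresholding probabilities."""
    counts = Counter()
    for truth, prob in zip(actual, predicted_prob):
        label = "YES" if prob >= threshold else "NO"
        counts[(truth, label)] += 1
    return counts


def accuracy(counts):
    """Share of cases whose predicted label equals the actual label."""
    correct = counts[("YES", "YES")] + counts[("NO", "NO")]
    return correct / sum(counts.values())


# Hypothetical outcomes and fitted probabilities.
actual = ["YES", "YES", "NO", "NO", "YES"]
fitted = [0.9, 0.4, 0.2, 0.7, 0.8]

matrix = confusion_matrix(actual, fitted)
print(dict(matrix), accuracy(matrix))
```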

Lastly, we performed cross validation (CV) to assess over-fitting, “a
situation when the model requires more information than the data can provide”
(Starkweather, 2013). CV randomly divides the data into a specified number
(e.g., k = 10) of groups. In CV, we use k-1 groups as a subset of the data and call
it the “training set.” We term the remaining group the “test set.” CV generates a
GLM from the training set then uses the test set to assess its prediction
accuracy. An “R” software function, called “cv.glm,” iterates CV k times, using
a different test set each time, to compute the cross-validation estimate.
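
The k-fold procedure can be sketched in a few lines. This is not a reimplementation of “cv.glm”. It is a generic illustration in which the "model" is simply the training-set mean and the cost is mean squared error. The function names, the seed, and the toy response are assumptions for the example.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the row indices 0..n-1 and deal them into k folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n, k, fit, score, seed=0):
    """Average the test-set cost over k folds, weighting by fold size."""
    total = 0.0
    for test in k_fold_indices(n, k, seed):
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        model = fit(train)                      # fit on the k-1 training folds
        total += score(model, test) * len(test)  # evaluate on the held-out fold
    return total / n


# Toy response; the "model" is just the training mean.
y = [float(i % 3) for i in range(30)]


def fit_mean(train):
    return sum(y[i] for i in train) / len(train)


def mean_squared_error(model, test):
    return sum((y[i] - model) ** 2 for i in test) / len(test)


folds = k_fold_indices(len(y), 10)
cv_error = cross_validate(len(y), 10, fit_mean, mean_squared_error)
print(len(folds), round(cv_error, 4))
```

In the retention setting, `fit` would estimate a GLM on the training rows and `score` would measure its prediction error on the test rows.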
RESEARCH  RESULTS


A.  HARDINESS  MODEL  1 
1.  Multicollinearity 

We  explored  the  relationship  between  hardiness  and  the  remaining  18 
pre-admission  variables  using  the  statistical  package  “R.”  We  inspected  the 
variables  for  sources  of  multicollinearity.  Table  6  contains  the  correlations 
between  hardiness  (hrdns2)  and  pre-admission  variables. 


         gend    race    f.deg   m.deg   poliv   pGrad   pServ   typ.hs  WCS     ceer    pae     cls     eas     aas     fas     hrdns2
gend     1.00
race     0.06    1.00
f.deg   -0.01   -0.01    1.00
m.deg    0.00    0.04    0.13    1.00
poliv   -0.10   -0.10    0.03   -0.06    1.00
pGrad    0.04   -0.04    0.09   -0.04    0.07    1.00
pServ    0.03    0.03    0.07    0.00    0.05    0.29    1.00
typ.hs  -0.04    0.00    0.02   -0.03    0.06    0.00   -0.03    1.00
WCS     -0.01    0.06   -0.06   -0.04   -0.09    0.01    0.06    0.06    1.00
ceer    -0.01    0.07   -0.06   -0.03   -0.09    0.00    0.05    0.07    0.89    1.00
pae      0.01   -0.10   -0.01   -0.01    0.02    0.05    0.05    0.00    0.13   -0.13    1.00
cls     -0.02    0.03    0.00   -0.03   -0.01    0.01    0.02    0.02    0.35   -0.06    0.16    1.00
eas      0.03    0.00    0.00   -0.02   -0.05   -0.01   -0.02    0.03    0.36    0.12   -0.06    0.66    1.00
aas     -0.04    0.03    0.00   -0.03    0.03    0.02    0.05   -0.01    0.04   -0.27    0.27    0.66   -0.09    1.00
fas     -0.10    0.05   -0.01    0.03   -0.03   -0.01    0.01    0.04    0.35    0.29    0.03    0.24    0.10    0.00    1.00
hrdns2  -0.09    0.02   -0.01   -0.01    0.04    0.02    0.01   -0.01   -0.04   -0.08    0.06    0.06    0.07    0.02    0.02    1.00

Table 6.  Hardiness Correlation
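
Each entry of a correlation table of this kind is a pairwise Pearson coefficient. The sketch below shows the computation and a simple screen for strongly correlated pairs. The |r| > 0.5 cutoff, the function names, and the small example columns are assumptions for illustration and are not the OIR data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def strongly_correlated(data, cutoff=0.5):
    """Return variable pairs whose absolute correlation exceeds the cutoff."""
    names = list(data)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson_r(data[first], data[second])
            if abs(r) > cutoff:
                flagged.append((first, second, round(r, 2)))
    return flagged


# Hypothetical columns: "b" is a multiple of "a"; "c" is unrelated to both.
example = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],
    "c": [1.0, -1.0, -1.0, 1.0],
}
flagged = strongly_correlated(example)
print(flagged)
```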


We saw no significant correlation between hardiness and any of the
numeric variables. However, we noticed strong correlations between several of
the pre-admission variables (WCS versus CEER, r = 0.89; EAS versus CLS, r =
0.66; AAS versus CLS, r = 0.67). The correlations made sense because WCS
and CLS relate in the following way:

WCS=  (6  x  CEER)  +  (3  x  CLS)  +  (PAE  SCORE) 

CLS=  (EAS+  AAS+  FAS)  /  3 
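
The two composite formulas above can be written directly as functions, which makes it easy to see why the composites correlate with their components. CEER carries six of the ten weight units in WCS, and each of EAS, AAS, and FAS is one third of CLS. The function names and the input scores below are hypothetical.

```python
def cls_score(eas, aas, fas):
    """CLS as the simple average of its three component scores."""
    return (eas + aas + fas) / 3


def wcs_score(ceer, eas, aas, fas, pae):
    """WCS as the weighted sum 6*CEER + 3*CLS + PAE given in the text."""
    return 6 * ceer + 3 * cls_score(eas, aas, fas) + pae


# Hypothetical candidate scores, chosen only to show the arithmetic.
example_cls = cls_score(eas=600, aas=500, fas=700)
example_wcs = wcs_score(ceer=600, eas=600, aas=500, fas=700, pae=550)
print(example_cls, example_wcs)
```

Because WCS and CLS are deterministic functions of the other predictors, keeping them alongside their components would add redundant information to the model.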
Strong correlations resulted in the removal of WCS and CLS from the data
set.  Additionally,  we  discovered  a  strong  correlation  between  the  categorical 
variables  recruited  athlete  and  played  sport  (playedsportl )  (r  =  0.86).  We 
removed  recruited  athlete  because,  when  given  the  choice,  actual  participation  in 
intercollegiate  athletics  interested  the  researchers  more  than  recruitment. 
Moreover,  retaining  “playedsportl”  accounts  for  cadets  who  “walk-on”  to 
intercollegiate  athletic  teams. 

We  completed  variance  inflation  factor  (VIF)  diagnostics,  shown  in  Table 
7,  to  identify  remaining  multicollinearity.  All  VIF  values  fall  below  ten,  suggesting 
a  lack  of  significant  multicollinearity. 


Variable       GVIF   Df   GVIF^(1/(2*Df))
gender         1.07   1    1.03
race           1.32   6    1.02
usmaps         1.33   1    1.15
playedsportl   1.47   1    1.21
ceer           1.59   1    1.26
pae            1.16   1    1.08
eas            1.11   1    1.05
aas            1.39   1    1.18
fas            1.13   1    1.06
f.degree       1.11   2    1.03
m.degree       1.09   2    1.02
poliview       1.18   5    1.02
pGrad          1.17   3    1.03
pServ          1.20   3    1.03

Table 7.  Variance Inflation Factor, Main Effects Model
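
The last column of Table 7 rescales the generalized VIF by its degrees of freedom so that multi-level factors are comparable with single-coefficient terms. A short sketch of that adjustment follows, checked against a few rows copied from the table. The function name is an assumption.

```python
def adjusted_gvif(gvif, df):
    """Degrees-of-freedom adjustment GVIF^(1 / (2 * Df))."""
    return gvif ** (1.0 / (2 * df))


# (GVIF, Df) pairs copied from Table 7.
table_rows = {
    "gender": (1.07, 1),
    "race": (1.32, 6),
    "usmaps": (1.33, 1),
    "poliview": (1.18, 5),
    "pGrad": (1.17, 3),
}

adjusted = {name: round(adjusted_gvif(g, df), 2)
            for name, (g, df) in table_rows.items()}
print(adjusted)
```

For a term with one degree of freedom the adjustment is simply the square root of the GVIF.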

2.  Main  Effects  Linear  Model  and  Stepwise  Regression 

We  created  a  linear  model  with  the  remaining  variables.  The  proportion  of 
variation in the response explained by these variables was extremely low (R2 =
0.04). Nevertheless, we identified the most significant variables, listed in Table
8, and proceeded to develop a hierarchical (stepwise) regression model.

The stepwise regression model found the same significant variables as
those shown in Table 8. However, we did not see an improvement in the
coefficient of multiple determination (R2 = 0.03). Notice also that several of the
variables  have  coefficient  estimates  at  or  near  zero  indicating  their  weakness  in 
predicting  hardiness. 


Variables            Estimate   Std. Error   t value   Pr(>|t|)
(Intercept)           1.826     0.153        11.943    <.001
genderMale           -0.079     0.014        -5.471    <.001
playedsportl         -0.029     0.014        -2.141    0.032
ceer                 -0.001     0            -5.422    <.001
pae                   0         0             3.879    <.001
eas                   0         0             4.087    <.001
m.degreeGradSchool    0.040     0.018         2.177    0.030

Table 8.  Significant Hardiness Predictors, Stepwise Regression

Appendix  G  contains  a  pairs-plot  of  the  significant  predictors  (Table  8)  and 
hardiness.  Consistent  with  the  regression  results,  the  plot  reveals  no  apparent 
relationships. 

3.  Kruskal-Wallis  and  Tukey  Mean  Comparison  Test 

Residual  plots,  also  located  in  Appendix  G,  did  not  show  any  violation  of 
assumptions.  We  used  the  Kruskal-Wallis  test  to  compare  the  difference  in 
average  hardiness  across  the  significant  (categorical  predictor)  levels  to  confirm 
this  (see  Table  9).  The  only  predictor  whose  average  hardiness  differed  between 
levels  turned  out  to  be  gender.  Inspection  of  the  hardiness  values  revealed  that 
a higher percentage of males achieved low (< 1.5) hardiness scores while a
higher percentage of females achieved high (> 1.5) hardiness scores. A strip
chart  visualizes  this  occurrence,  shown  in  Figure  2. 

Total Hardiness

Variable       chi-squared   df   p-value
gender         28.001        1    <.001
playedsportl   0.038         1    0.847
typehs         3.876         4    0.423
f.degree       1.879         2    0.391
m.degree       5.505         2    0.064

Table 9.  Kruskal-Wallis Test: Hardiness versus Categorical Predictors
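
The chi-squared column of Table 9 is the Kruskal-Wallis H statistic, which compares rank sums across groups. The sketch below computes H from pooled average ranks. It omits the tie correction that statistical packages usually apply, and the two small groups are hypothetical values rather than cadet scores.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), no tie correction."""
    pooled = [value for group in groups for value in group]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total = 0.0
    offset = 0
    for group in groups:
        rank_sum = sum(ranks[offset:offset + len(group)])
        total += rank_sum ** 2 / len(group)
        offset += len(group)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Two hypothetical groups with no overlap in values.
h_statistic = kruskal_wallis_h([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(round(h_statistic, 3))
```

Under the null hypothesis, H is compared with a chi-squared distribution whose degrees of freedom equal the number of groups minus one, which is the df column in Table 9.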


[Strip chart: Hardiness by Gender]

Figure 2.  Hardiness versus Gender

The Tukey test for determining differences between means confirmed that
the average hardiness between males and females differed. However, while
the difference was significant (p < .001), the effect size was negligible. Table 10
shows the difference (in means) and confidence interval information.


95% family-wise confidence level

Lower     Upper     Difference   p-value adjusted
-0.101    -0.044    -0.072       <.001

Table 10.  Tukey Comparison of Average Hardiness by Gender

4.  Stepwise  Regression  Linear  Model  with  Interaction 

We  conducted  similar  procedures  for  a  stepwise  linear  model  with 
interaction  between  the  terms.  Interestingly,  several  ‘new’  predictors  showed  as 
significant,  in  addition  to  those  with  interaction.  See  Table  11  for  the  significant 
predictors from this model. In addition, we see our first improvement⁶ to the
coefficient of determination (R2 = 0.053) and to the beta “estimates” in Table 11.


Variables                       Estimate   Std. Error   t value   Pr(>|t|)
genderMale                      -0.261     0.087        -3.006    0.003
ceer                            -0.002     0.001        -2.948    0.003
fas                              0.012     0.005         2.479    0.013
pae                             -0.002     0.001        -2.420    0.016
typehsPublic                     6.994     3.453         2.025    0.043
typehsPriv.Rel                   7.247     3.474         2.086    0.037
fas:typehsPublic                -0.010     0.005        -2.056    0.040
f.degreeHighSchool              -0.141     0.030        -4.706    <.001
genderMale:f.degreeHighSchool    0.149     0.033         4.560    <.001
pae:aas                          0         0             3.016    0.003
fas:typehsPriv.Rel              -0.010     0.005        -2.126    0.034

Table 11.  Significant Hardiness Predictors, Stepwise with Interaction


6  We  see  beta  coefficients  >  1  as  an  improvement  over  beta  coefficients  <  1. 

Type of high school (public and private religious) appears positively
related to hardiness, and these are the first variables with coefficient estimates
greater than one. However, the relationship between public school and hardiness is
suspicious since the population is predominantly educated in the public school
system (only 658 out of 3,381 data points correspond to private high school
students).

The interactions between faculty appraisal score (FAS) and public high school
or private religious high school barely, yet inversely, relate to hardiness.

Next,  for  both  genders,  less-educated  fathers  appear  to  decrease 
hardiness  while  less-educated  fathers  of  male  cadets  relate  positively  to 
hardiness. 

5.  Analysis  of  Variance  between  Models 

We  conducted  ANOVA  on  the  two  stepwise  regression  models,  (Table  8 
and  Table  11)  and  discovered  that  the  “full”  (more  complicated,  stepwise  with 
interaction)  model  performed  better  than  the  reduced  model  (no  interaction 
terms).  See  Table  12. 


Model 1: hrdns2 ~ stepwise
Model 2: hrdns2 ~ stepwise with interaction

Model   Res.Df   RSS      Df   Sum of Sq   Pr(>Chi)
1       3372     306.72
2       3342     298.59   30   8.1259      <.001

Table 12.  Hardiness Stepwise Model ANOVA Results

In  summary,  we  achieved  coefficient  estimates  less  than  one,  except  for 
private  religious  and  public  high  school  types,  in  the  stepwise  interaction  model. 
Nearly  all  predictors  contributed  poorly  to  total  hardiness.  However,  the  stepwise 
model  with  interaction  performed  better,  signifying  the  apparent  relationship  of 
several  predictors  to  hardiness,  when  combined  (interaction). 
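
The comparison in Table 12 can be illustrated with the usual F statistic for nested linear models, computed from the residual sums of squares and residual degrees of freedom. This is a sketch under that assumption. Table 12 reports a chi-squared p-value, so the test actually run in the study may differ. The function name is hypothetical.

```python
def nested_f_statistic(rss_reduced, df_reduced, rss_full, df_full):
    """F statistic comparing a reduced linear model with a fuller nested one."""
    extra_terms = df_reduced - df_full
    numerator = (rss_reduced - rss_full) / extra_terms
    denominator = rss_full / df_full
    return numerator / denominator


# Residual sums of squares and residual degrees of freedom from Table 12.
f_value = nested_f_statistic(306.72, 3372, 298.59, 3342)
print(round(f_value, 2))
```

The 30 additional terms reduce the residual sum of squares by about 8.13, which is consistent with the "Sum of Sq" entry in Table 12.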

B.  HARDINESS  MODEL  2


We  created  a  second  hardiness  model  to  determine  the  predictive  power 
of  hardiness  within  a  target  group  of  the  data.  On  the  premise  that  athletes 
undergo  trials  similar  to  those  experienced  in  the  military  lifestyle,  we  chose  the 
USMA varsity football team. Table 13 shows a summary of the data (N = 146).


USMAPS                      Yes     No
                            36      110

AVERAGE HARDINESS (H)       Total H   H-Com.   H-Con.   H-Cha.
                            1.97      2.08     2.07     1.76

FATHER'S DEGREE LEVEL       High Sch.   College   Graduate
                            53          65        28

RACE                        Asian   Am.Indian   Afr.Amer   Cauc.   Hisp.   Unk.
                            1       1           30         111     2       1

MOTHER'S DEGREE LEVEL       High Sch.   College   Graduate
                            54          73        19

POLITICAL VIEWS             Far Left   Liberal   Moder.   Conserv.   Far Right
                            0          20        65       61         0

TYPE OF HIGH SCHOOL         Public   Priv-Relig.   Priv-Gen.   Priv-Mil.
                            120      21            4           1

PARENTS'  MILITARY  SERVICE 

PARENT  USMA  GRADUATES 

Both 

Father 

Mother 

Neither 

Both 

Father 

Mother 

Neither 

12 
88


141 


Table  13.  Data  Summary,  USMA  Football  Player  Data  Set 

1.  Multicollinearity 

We  constructed  a  correlation  table  for  the  full  set  of  variables  for  the 
football  team  data  set  in  order  to  identify  potential  sources  of  multicollinearity 
(see  Table  14). 

         race    usmaps  f.deg   m.deg   polivw  pgrad   pserv   typ.hs  WCS     ceer    pae     cls     eas     aas     fas     cm2     co2     ch2     hrdns2
race     1.00
usmaps  -0.29    1.00
f.deg   -0.08    0.16    1.00
m.deg   -0.02    0.13    0.20    1.00
polivw  -0.16    0.06    0.15    0.19    1.00
pgrad   -0.05    0.02    0.07    0.06    0.22    1.00
pserv    0.02    0.06    0.24    0.11    0.05    0.23    1.00
typ.hs  -0.11    0.04    0.07    0.02    0.09   -0.01   -0.01    1.00
WCS      0.17   -0.38   -0.01   -0.12   -0.17   -0.06    0.05    0.16    1.00
ceer     0.26   -0.42   -0.04   -0.16   -0.19   -0.04    0.05    0.17    0.91    1.00
pae     -0.26    0.08   -0.02   -0.05   -0.03    0.08    0.09    0.09    0.10   -0.09    1.00
cls      0.01   -0.07    0.06    0.08   -0.01   -0.12   -0.06    0.06    0.43    0.08   -0.01    1.00
eas     -0.06    0.01    0.07    0.03   -0.06   -0.11   -0.06    0.10    0.46    0.17   -0.01    0.87    1.00
aas      0.07   -0.07    0.02    0.13    0.12    0.03    0.04   -0.06   -0.07   -0.24   -0.01    0.36   -0.10    1.00
fas      0.13   -0.25   -0.05   -0.02   -0.04   -0.18   -0.12   -0.02    0.27    0.19   -0.03    0.28    0.15   -0.08    1.00
cm2      0.20   -0.01   -0.01   -0.03   -0.17    0.00    0.03    0.06    0.04   -0.04    0.13    0.15    0.09    0.13    0.08    1.00
co2      0.03    0.13    0.09    0.00    0.05    0.11    0.00    0.12   -0.05   -0.12    0.11    0.18    0.18    0.02    0.05    0.54    1.00
ch2      0.01   -0.14    0.01    0.17    0.19   -0.03    0.03   -0.07   -0.03   -0.05    0.10   -0.01   -0.03    0.07   -0.05    0.10   -0.01    1.00
hrdns2   0.11   -0.03    0.04    0.09    0.06    0.03    0.03    0.03   -0.02   -0.10    0.16    0.14    0.10    0.11    0.03    0.74    0.67    0.63    1.00

Table 14.  Hardiness Correlation, Football Team

Again, WCS correlated significantly (|r| > 0.5) with CEER, and CLS correlated significantly with EAS. Interestingly, unlike hardiness model 1, the second hardiness model showed no significant correlation between CLS and AAS for the football players.7 Lastly, as expected, the hardiness facets correlate with total hardiness. We removed WCS, CLS, and the hardiness facets from the model to minimize the effects of multicollinearity.
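
The screening rule applied here (flag any pair of variables with |r| > 0.5) is easy to express in code. The following Python sketch is illustrative only and is not the code used in this study; the helper name flag_collinear is hypothetical, and the dictionary holds just a few of the pairwise values from Table 14 rather than the full matrix.

```python
# Minimal sketch of the |r| > 0.5 screening rule (illustrative only).
# The pairs below are a small subset of Table 14, not the full matrix.
THRESHOLD = 0.5

table14_subset = {
    ("WCS", "ceer"): 0.91,
    ("cls", "eas"): 0.87,
    ("cls", "aas"): 0.36,
    ("cm2", "hrdns2"): 0.74,
    ("co2", "hrdns2"): 0.67,
    ("ch2", "hrdns2"): 0.63,
    ("usmaps", "ceer"): -0.42,
}


def flag_collinear(pairs, threshold=THRESHOLD):
    """Return the variable pairs whose absolute correlation exceeds threshold."""
    return sorted(pair for pair, r in pairs.items() if abs(r) > threshold)


flagged = flag_collinear(table14_subset)
for a, b in flagged:
    print(f"{a} ~ {b}: r = {table14_subset[(a, b)]:+.2f}")
```

On this subset the rule flags WCS/ceer, cls/eas, and the three facet/total-hardiness pairs, while cls/aas (0.36) and usmaps/ceer (-0.42) fall below the cutoff.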

2.  Main  Effects  Linear  Model  and  Stepwise  Regression 

After comparing simple linear and stepwise regression models, we noticed stepwise regression produced a better coefficient of determination (R² = 0.19, adjusted R² = 0.11) and four significant terms at the p < .05 level; however, only the coefficient estimates for high school type (public and private religious) and mother’s degree (graduate school) exceeded 0.1. Each of the significant variables related positively to hardiness. High school type became the strongest predictor of hardiness in the stepwise method. See Table 15 for the summary of significant variables.

7 We did not include gender because only males participate in varsity football. Additionally, we saw previously that the difference in hardiness between genders is negligible.


Variable              Estimate   Std. Error   t value   Pr(>|t|)
typehsPriv.Rel           0.492        0.163     3.014      0.003
pae                      0.001        0.000     2.977      0.003
typehsPublic             0.348        0.152     2.295      0.023
m.degreeGradSchool       0.172        0.077     2.239      0.027

Table  15.  Significant  Football  Hardiness  Predictors,  Stepwise  Regression 

3.  Interaction  Model  and  Stepwise  Regression 

We created another model with interaction terms and used the stepwise method to identify the significant variables. Model quality appeared to improve greatly (R² = 0.97, adjusted R² = 0.70). We found 36 significant terms at the p < .05 level; however, only 11 of the coefficient estimates exceeded 1.0 in absolute value. Rather than listing every significant interaction term, we list the variables involved: mothers’ degree level, fathers’ degree level, political view, type of high school, race, and USMAPS attendance. Table 16 shows the summary output for the significant terms.

Variable                              Estimate   Std. Error   t value   Pr(>|t|)
m.degreeHighSchool                     -15.880        3.747    -4.237      0.001
poliviewL                              -15.600        6.776    -2.303      0.036
f.degreeGradSchool:typehsPriv.Rel       -6.525        2.381    -2.741      0.015
raceAf. Amer:poliviewL                  -1.819        0.684    -2.660      0.018
raceAf. Amer:poliviewM                  -1.207        0.437    -2.761      0.015
poliviewL:pServFather                   -1.205        0.558    -2.158      0.048
usmapsYes:poliviewM                      1.063        0.378     2.814      0.013
usmapsYes:poliviewL                      3.094        0.845     3.660      0.002
raceHisp                                 3.619        1.167     3.100      0.007
raceAf. Amer:pServNeither                6.235        2.167     2.877      0.012
raceAf. Amer:pServFather                 7.345        2.280     3.221      0.006

Residual standard error: 0.167 on 15 degrees of freedom
Multiple R-squared: 0.9687, Adjusted R-squared: 0.6978
F-statistic: 3.576 on 130 and 15 DF, p-value: 0.00357

Table  16.  Significant  Variables,  Hardiness  Model  2,  Stepwise  Interaction 

Table 16 indicates that less-educated mothers, liberal political views, liberal and moderate political views among African American players, and educated fathers of cadets who attended private religious schools relate negatively to hardiness. On the other hand, moderate and liberal political views, when combined with USMAPS attendance, relate positively to hardiness. Finally, African Americans with military fathers, African Americans with non-military parents, and Hispanics relate positively to total hardiness. The model suggests African American football players with military fathers are more likely to display hardiness.
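
The gap between the multiple and adjusted R-squared values in Table 16 follows directly from the degrees of freedom reported there. The Python sketch below is a worked check under the standard adjusted R-squared formula; it is not the code used in this study, and the function name adjusted_r2 is hypothetical. The sample size is inferred from the F-statistic line (130 model and 15 residual degrees of freedom).

```python
def adjusted_r2(r2, n_obs, resid_df):
    """Adjusted R-squared from R-squared, sample size, and residual df."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / resid_df


model_df = 130   # numerator df of the F-statistic in Table 16
resid_df = 15    # residual df in Table 16
n_obs = model_df + resid_df + 1   # one extra df for the intercept

adj = adjusted_r2(0.9687, n_obs, resid_df)
print(f"n = {n_obs}, adjusted R-squared = {adj:.3f}")
```

With 130 estimated terms on 146 observations, only 15 degrees of freedom remain for the residuals, which is why the adjusted value sits so far below the multiple R-squared.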

4.  Hardiness,  Race  and  Parents’  Military  Service 

On inspection, race (Hispanic) related less to hardiness than the summary output in Table 16 suggested. We believe the small number of Hispanic football players (only two of the 146 observations) inflated their average hardiness.

[Plot: Hardiness by Race]

Figure 3. Race versus Hardiness, Football Players

Caucasians  and  African  Americans  dominate  the  population  of  football 
players.  Caucasians,  as  the  majority,  have  the  highest  hardiness  average, 
followed  by  African  American  players. 

Next, we created an interaction plot to see how the average hardiness score changed between the races as parents’ service status changed (see Figure 4). The effect of parents’ military service on hardiness differs between African Americans and Caucasians. Caucasian football players’ hardiness is highest when both parents serve or served in the military, while African American players’ hardiness is highest when only the father served.

Figure 4. Interaction Plot: Race, Parents’ Service and Hardiness

Only one African American player had a mother who served in the military; that player’s hardiness score was 1.4. This explains the deep trough in the African American average hardiness over “Mother.” If the data contained more African American football players with military mothers, the interaction plot could change. The other races do not appear in the interaction plot due to lack of representation in the sample (see the single dots in Figure 3 over American Indian, Asian, and unknown).

5. Hardiness, USMAPS Attendance and Political View

Next, we investigated USMAPS attendance, political views and hardiness. The average hardiness scores for non-USMAPS and USMAPS players are 1.97 and 1.95, respectively (see Figure 5).


[Plot: Hardiness by USMAPS Attendance; x-axis categories: No, Yes]

Figure 5. USMAPS versus Hardiness, Football Players

“Far left” or “far right” political views occurred in only three of the 146 entries, which made the original interaction plot difficult to interpret. The lack of “far left” USMAPS attendees added to the difficulty. Therefore, we recoded “far left” as liberal and “far right” as conservative. The interaction plot in Figure 6 indicates the effect of political views on hardiness is generally the same for USMAPS players and non-USMAPS players.8 However, conservative (C) USMAPS players tend to have lower average hardiness.

[Interaction plot; x-axis: Political View (C, L, M)]

Figure 6. Interaction Plot: USMAPS, Political View and Hardiness

8 C: conservative, L: liberal, M: moderate political views

6. Hardiness, High School Type and Fathers’ Degree Level

Lastly, we investigated the interaction between high school type and fathers’ degree level. Figure 7 shows that public and private religious schools dominate the population (average hardiness 1.97 and 2.03, respectively).

[Plot: Hardiness by High School Type]

Figure 7. High School Type versus Hardiness, Football Players


We created an interaction plot and noticed that average hardiness does not change for public school players as fathers’ degree level changes. However, for players educated in private religious high schools, average hardiness decreases as fathers’ education level increases. Although the small sample size limits interpretation for private general schools, their pattern resembles that of private religious schools.

[Interaction plot; x-axis: Fathers’ Degree Level (College, GradSchool, HighSchool)]

Figure 8. Interaction Plot: High School, Father’s Degree and Hardiness

C. RETENTION MODEL 1: GRADUATION VERSUS SEPARATION

1. Multicollinearity

Retention model 1 aims to predict the likelihood of graduation (vice separation) from West Point. First, we inspected the variables for multicollinearity. Table 17 displays the correlations between the 12 numeric variables (including our response, Graduation Status, denoted “Gstat”). We did not see any significant correlation between retention and any of the numeric variables. However, we observed significant correlation between two pairs of variables (APSC versus CEER, r = 0.61; APSC versus MPSC, r = 0.54). These correlations did not surprise us, since APSC, CEER, and MPSC are strong academic indicators of performance. The correlation results led us to remove APSC from the model.

Note: We included a pairs plot of the significant variables in Appendix G.


            cm2    co2    ch2   ceer    pae    eas    aas    fas   apsc   mpsc   ppsc  m.deg  f.deg  typ.a  Gstat
cm2        1.00
co2        0.49   1.00
ch2        0.16   0.02   1.00
ceer      -0.04  -0.17   0.01   1.00
pae        0.05   0.07   0.02  -0.12   1.00
eas        0.07   0.06   0.02   0.11  -0.07   1.00
aas        0.01   0.04  -0.02  -0.26   0.26  -0.09   1.00
fas        0.07   0.02  -0.05   0.28   0.02   0.10   0.00   1.00
apsc      -0.01  -0.09  -0.04   0.61  -0.03   0.08  -0.14   0.28   1.00
mpsc       0.08   0.04  -0.05   0.26   0.02   0.17  -0.05   0.26   0.54   1.00
ppsc       0.08   0.03   0.03   0.13   0.30   0.00   0.16   0.16   0.45   0.44   1.00
m.deg      0.02   0.01   0.00  -0.04  -0.02  -0.01  -0.03   0.03  -0.06  -0.03  -0.04   1.00
f.deg      0.00  -0.01  -0.01  -0.06   0.00   0.00   0.00  -0.02  -0.06  -0.02  -0.03   0.16   1.00
typ.a     -0.02   0.03  -0.03  -0.22   0.21  -0.16   0.34  -0.02  -0.13  -0.10   0.10  -0.01  -0.01   1.00
Gstat      0.03   0.03  -0.05   0.10   0.02   0.04  -0.01   0.09   0.41   0.28   0.31  -0.04  -0.05  -0.05   1.00

Table  17.  Retention  (Graduation  versus  Separation)  Correlation 


2.  Logit-link  GLM  and  Stepwise  Regression 

Next, we created a GLM using the logit link function and then ran stepwise regression. The stepwise method yielded the same significant variables as the main effects model, but the ANOVA test showed a higher residual deviance for the stepwise model. However, the stepwise model yielded a better (lower) AIC value, so we retained it. Tables 18 and 19 show the significant variables from the stepwise model (GLM 1) and the ANOVA test, respectively.

Variable              Estimate   Std. Error   z value   Pr(>|z|)
(Intercept)             -5.445        0.601    -9.061      <.001
mpsc                     1.145        0.138     8.279      <.001
ppsc                     1.778        0.149    11.904      <.001
ch2                     -0.288        0.099    -2.911      0.004
typ.athVar              -0.358        0.146    -2.456      0.014
f.degreeHighSchool      -0.253        0.114    -2.216      0.027

Null deviance: 2914.5 on 3149 degrees of freedom
Residual deviance: 2493.5 on 3141 degrees of freedom
AIC: 2511.5

Table  18.  Significant  Retention  Predictors,  Stepwise  GLM  1 


Model 1: status ~ typ.ath + m.degree + f.degree + cm2 + co2 + ch2 + ceer + pae + eas + aas + fas + mpsc + ppsc
Model 2: status ~ typ.ath + f.degree + ch2 + pae + mpsc + ppsc

Model    Res.Df       RSS      Df   Sum of Sq   Pr(>Chi)
1       3133.00   2490.20
2       3141.00   2493.50   -8.00       -3.24       0.92

Table  19.  GLM  1  Main  Effects  and  Stepwise  ANOVA  Results 
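
The AIC and ANOVA figures in Tables 18 and 19 can be checked by hand. The Python sketch below is illustrative and is not the code used in this study. It assumes an ungrouped binary response, for which AIC equals the residual deviance plus twice the number of fitted coefficients, and it uses a closed-form chi-square tail probability that is valid only for even degrees of freedom. Both function names are hypothetical.

```python
import math


def aic_from_deviance(resid_deviance, null_df, resid_df):
    """AIC = deviance + 2k, with k fitted coefficients (binary-response GLM)."""
    k = null_df - resid_df + 1   # slopes plus the intercept
    return resid_deviance + 2 * k


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df % 2 != 0:
        raise ValueError("closed form requires even degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


aic_stepwise = aic_from_deviance(2493.5, 3149, 3141)   # Table 18
aic_main = aic_from_deviance(2490.2, 3149, 3133)       # Model 1 in Table 19
p_value = chi2_sf_even_df(3.24, 8)                     # deviance change on 8 df

print(f"AIC stepwise = {aic_stepwise:.1f}, AIC main effects = {aic_main:.1f}")
print(f"p-value for the deviance change = {p_value:.2f}")
```

The stepwise model gives up a small amount of deviance but estimates eight fewer coefficients, so its AIC is lower, and the large p-value indicates the dropped terms do not improve the fit significantly.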

MPSC  and  PPSC  relate  positively  to  retention;  hardiness-challenge  (ch2), 
varsity  athletes  (typ.athVar),  and  father’s  degree  (high  school)  relate  negatively 
to  retention.  PPSC  is  the  strongest  indicator  of  retention  among  the  variables  in 
the  model. 

Next, we created a second (main effects) GLM (not shown) using the clog-log link to see whether it fit better than the logit-link model, but it did not. We observed a difference in deviance of 37.5 between the logit and clog-log models, in favor of the logit-link model.
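
For readers unfamiliar with the two links compared here, the Python sketch below shows how each maps a linear predictor to a probability. It is a generic illustration of the standard definitions, not the code used in this study, and the function names are hypothetical.

```python
import math


def inv_logit(eta):
    """Inverse logit link: symmetric about eta = 0."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_cloglog(eta):
    """Inverse complementary log-log link: asymmetric about eta = 0."""
    return 1.0 - math.exp(-math.exp(eta))


for eta in (-2.0, 0.0, 2.0):
    print(f"eta = {eta:+.1f}  logit: {inv_logit(eta):.3f}  "
          f"clog-log: {inv_cloglog(eta):.3f}")
```

The logit link treats the two outcomes symmetrically, whereas the clog-log link approaches 1 faster than it approaches 0, so the two can fit the same binary data differently.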

3.  Generalized  Additive  Models 

The third model we constructed started with a generalized additive model (GAM), using the original predictors and incorporating a smoothing function on the numeric variables. We plotted the partial residuals against their predictors to identify variables that needed transformation. The plots in Figure 9 reveal that hardiness-challenge, type of athlete, and father’s degree appear linear, while MPSC and PPSC appear logarithmic. Note: the other variables appeared linear, but we did not include them in the figure to save space.

Figure 9. Partial Residuals versus Predictor Plot, GAM

We made logarithmic transformations on MPSC and PPSC, fit a third GLM, and then conducted stepwise regression on the new model. We observed an improvement (decrease) in AIC and found an additional significant variable, namely, PAE score. The stepwise model from our third GLM (GLM 3) turned out to be our best model. Table 20 contains the summary statistics for the stepwise model with the transformed variables. Equation (1) shows the fitted model.


Variable              Estimate   Std. Error   z value   Pr(>|z|)
(Intercept)             -5.739        0.606    -9.471      <.001
log(mpsc)                3.307        0.363     9.102      <.001
log(ppsc)                5.041        0.416    12.114      <.001
ch2                     -0.296        0.100    -2.955      0.003
typ.athVar              -0.340        0.147    -2.321      0.020
f.degreeHighSchool      -0.261        0.115    -2.257      0.024
pae                     -0.002        0.001    -1.990      0.047

Null deviance: 2914.5 on 3149 degrees of freedom
Residual deviance: 2459.5 on 3141 degrees of freedom
AIC: 2477.5

Table  20.  Significant  Retention  Predictors,  Stepwise  GLM  3 


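$$\eta_i = \beta_0 + \beta_1 x_1 + \cdots + \beta_{p-1} x_{p-1}$$

$$\eta_i = -5.739 + 3.307\,x_{\log(\mathrm{mpsc})} + 5.041\,x_{\log(\mathrm{ppsc})} - 0.296\,x_{\mathrm{ch2}} - 0.340\,x_{\mathrm{typ.athVar}} - 0.261\,x_{\mathrm{f.degree(high.school)}} - 0.002\,x_{\mathrm{pae}}$$

$$\pi_i = \frac{\exp(\eta_i)}{1 + \exp(\eta_i)} \tag{1}$$

The fitted model ($\eta_i$) shows hardiness-challenge, varsity athletes, father’s degree (high school), and PAE relate inversely to retention (graduation). Next, the model established MPSC and PPSC as positive contributors to retention. The strongest predictors of retention are MPSC and PPSC.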
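
To make the fitted model concrete, the Python sketch below evaluates it for a single cadet using only the coefficients printed in Table 20. It is an illustration, not the code used in this study. The function name graduation_probability and the input values are hypothetical, the scales assumed for MPSC, PPSC, hardiness-challenge, and PAE are guesses, and any model terms not listed in Table 20 are treated as zero.

```python
import math

# Coefficients copied from Table 20 (stepwise GLM 3).
COEF = {
    "intercept": -5.739,
    "log_mpsc": 3.307,
    "log_ppsc": 5.041,
    "ch2": -0.296,
    "varsity": -0.340,
    "father_hs": -0.261,
    "pae": -0.002,
}


def graduation_probability(mpsc, ppsc, ch2, pae, varsity=False, father_hs=False):
    """Predicted probability of graduation from the fitted logit model."""
    eta = (COEF["intercept"]
           + COEF["log_mpsc"] * math.log(mpsc)
           + COEF["log_ppsc"] * math.log(ppsc)
           + COEF["ch2"] * ch2
           + COEF["varsity"] * int(varsity)
           + COEF["father_hs"] * int(father_hs)
           + COEF["pae"] * pae)
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical cadet; these input values are illustrative, not from the data.
cadet = dict(mpsc=3.0, ppsc=3.0, ch2=2.0, pae=500.0)
p_base = graduation_probability(**cadet)
p_varsity = graduation_probability(**cadet, varsity=True)

# Because MPSC enters on the log scale, doubling it multiplies the odds by
# 2 ** 3.307, holding the other terms fixed.
odds_multiplier = 2 ** COEF["log_mpsc"]

print(f"base: {p_base:.3f}, varsity: {p_varsity:.3f}, "
      f"odds multiplier for doubling MPSC: {odds_multiplier:.1f}")
```

The log transformation changes how the coefficients read: a coefficient on log(mpsc) describes the effect of a proportional change in MPSC, not a one-unit change.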

4.  Confusion  Matrix,  Retention  Model  1 

A confusion matrix helped determine how well stepwise GLM 3 (Table 20) classified the responses. We used the following predicted probability threshold: $\hat{\pi}_i > 0.77$, retained (graduated); $\hat{\pi}_i < 0.77$, separated. This threshold maximized the correct classification rate and minimized the incorrect classification rate. Table 21 shows the confusion matrix.

Raw Numbers
                           Observed Value
                        Graduate    Separate
Model       Graduate        2158         443
Predicted   Separate         250         299

Percentages
                           Observed Value
                        Graduate    Separate
Model       Graduate         83%         17%
Predicted   Separate         46%         54%

Table 21. Graduation versus Separation Confusion Matrix

Reading Table 21 by row, 83 percent of the cadets the model predicted to graduate did graduate, and 17 percent separated. Of the cadets the model predicted to separate, 54 percent separated and 46 percent graduated. The model predicted the occurrence of graduation better than it did separation.
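
The percentages in Table 21 can be reproduced from the raw counts. The Python sketch below is illustrative and is not the code used in this study. It assumes the layout shown in the table, with rows as model predictions and columns as observed outcomes, so each percentage is a share of a predicted class. The function names are hypothetical.

```python
# Rows: model prediction; columns: observed outcome (Table 21 raw counts).
table21 = {
    "Graduate": {"Graduate": 2158, "Separate": 443},
    "Separate": {"Graduate": 250, "Separate": 299},
}


def row_percentages(matrix):
    """Percent of each predicted class that falls in each observed class."""
    out = {}
    for predicted, counts in matrix.items():
        total = sum(counts.values())
        out[predicted] = {obs: round(100 * n / total) for obs, n in counts.items()}
    return out


def accuracy(matrix):
    """Share of all cases on the diagonal (prediction matches observation)."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(counts.values()) for counts in matrix.values())
    return correct / total


pct = row_percentages(table21)
acc = accuracy(table21)
print(pct)
print(f"overall correct classification rate: {acc:.2f}")
```

Under this reading, 2,457 of the 3,150 cadets fall on the diagonal, an overall correct classification rate of 78 percent.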

5.  Cross-validation  Results 

Using a function from R, we obtained a CV estimate of 22 percent, which is the percentage of responses the model misclassified.

D. RETENTION MODEL 2: ACTIVE DUTY VERSUS LOSS

1.  Multicollinearity 

Similar to the previous models, we assessed the correlation between the variables to detect any significant multicollinearity (|r| > 0.5). Table 22 shows the correlations for Retention Model 2.

          typ.ath   m.deg   f.deg     cm2     co2     ch2    ceer     pae     eas     aas     fas    apsc    mpsc    ppsc    babr  A.stat
typ.ath      1.00
m.deg        0.04    1.00
f.deg       -0.01    0.13    1.00
cm2         -0.01    0.02    0.01    1.00
co2          0.04    0.00    0.00    0.48    1.00
ch2         -0.02    0.04   -0.01    0.18    0.02    1.00
ceer        -0.25   -0.04   -0.04    0.01   -0.16    0.03    1.00
pae          0.24    0.01    0.01    0.07    0.08    0.02   -0.16    1.00
eas         -0.17   -0.04    0.02    0.08    0.05    0.03    0.10   -0.06    1.00
aas          0.37   -0.03    0.01    0.00    0.05   -0.02   -0.29    0.29   -0.10    1.00
fas         -0.01   -0.01   -0.02    0.10   -0.02   -0.02    0.31    0.05    0.09    0.02    1.00
apsc        -0.14   -0.07   -0.04    0.01   -0.10   -0.02    0.64   -0.08    0.08   -0.18    0.31    1.00
mpsc        -0.09   -0.05    0.00    0.13    0.08   -0.02    0.24    0.01    0.18   -0.05    0.26    0.52    1.00
ppsc         0.12   -0.03   -0.02    0.11    0.04    0.05    0.10    0.32    0.03    0.16    0.23    0.38    0.46    1.00
babr        -0.03    0.00   -0.08    0.05    0.03    0.07    0.05    0.03    0.04   -0.03    0.02    0.08    0.02    0.06    1.00
A.stat      -0.08    0.00   -0.01    0.01   -0.03   -0.02    0.06   -0.06    0.06   -0.09   -0.03    0.02    0.06    0.01   -0.02    1.00

Table  22.  Retention  (active  duty  versus  loss)  Correlation 

Again, APSC correlated significantly with MPSC and CEER, as shown earlier. Thus, we removed APSC from the model. We included a pairs plot of retention status and several predictors in Appendix G.

2.  Logit-link  GLM  and  Stepwise  Regression 

•  Note (1): The following results continue the GLM numbering pattern
used in Retention Model 1.

•  Note (2): Retention Model 2 attempts to predict the retention of
USMA graduates after their initial five-year commitment. Because
prior research suggests military officers decide whether to leave or
stay between their sixth and seventh year of service, we excluded
USMA  YG  2007  from  this  data  set  (see  Table  5,  section  1 1C). 

We created our first GLM for this model (GLM 4) using the logit link and used ANOVA to compare it with the model generated by stepwise regression. Although the main effects model had a lower residual deviance in the ANOVA, we retained the stepwise model because it had a lower AIC. We also compared a clog-log-link model (GLM 5, not shown) against the stepwise model, but rejected it because it performed worse than the logit-link model. Tables 23 and 24 show the significant variables from the stepwise model of GLM 4 and the ANOVA, respectively.

Variable       Estimate   Std. Error   z value   Pr(>|z|)
(Intercept)       0.547        0.454     1.204      0.228
babrIN            1.212        0.282     4.300      <.001

Null deviance: 1903.1 on 1411 degrees of freedom
Residual deviance: 1795.1 on 1394 degrees of freedom
AIC: 1831.1

Table  23.  Significant  Retention  Predictors,  Stepwise  GLM  4 


1. Main Effects Model: status ~ typ.ath + m.degree + f.degree + cm2 + co2 + ch2 + ceer + pae + eas + aas + fas + mpsc + ppsc + babr
2. Stepwise Model: status ~ typ.ath + aas + babr

Model   Resid. Df   Resid. Dev        Df   Deviance   Pr(>Chi)
1            1381       1788.9
2            1394       1795.1   -13.000     -6.238      0.937

Table  24.  Main  Effects  and  Stepwise  GLM  4  ANOVA  Results 

Each  model  revealed  the  Infantry  basic  branch  (babr)  as  the  most 
significant  predictor.  However,  the  main  effects  model  also  showed  the  military 
police  branch  as  significant.  It  is  worth  noting  that,  although  not  significant  at  the 
p<  0.05  level,  both  the  main  effects  and  stepwise  models  generated  armor  and 
engineer  branches  at  p<0.07.  All  of  these  branches  related  positively  to  retention 
beyond  one’s  active  duty  service  obligation. 

3.  Generalized  Models  and  Stepwise  Regression 

We attempted to create a third GLM by first using a GAM to identify needed variable transformations. However, all partial residuals versus predictor plots appeared linear. Figure 10 shows the partial residuals versus predictor plot for basic branch.

Figure 10. Partial Residuals versus Predictor Plot, GAM

Using  the  summary  output  from  Table  23,  we  constructed  the  fitted  model. 
The  fitted  model,  equation  (2),  shows  infantry  as  the  strongest  indicator  of 
retention. 

$$\eta_i = \beta_0 + \beta_1 x_1 + \cdots + \beta_{p-1} x_{p-1}$$

$$\eta_i = 0.547 + 1.212\,x_{\mathrm{babr(Infantry)}}$$

$$\pi_i = \frac{\exp(\eta_i)}{1 + \exp(\eta_i)} \tag{2}$$
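
The size of the Infantry effect is easier to judge on the odds scale. The Python sketch below is illustrative and is not the code used in this study. It uses only the two coefficients printed in Table 23 and treats every other term in the stepwise model as zero, so the probabilities are for the reference branch and for Infantry under that simplifying assumption.

```python
import math

INTERCEPT = 0.547        # Table 23
BABR_INFANTRY = 1.212    # Table 23, coefficient for babrIN


def inv_logit(eta):
    """Map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


odds_ratio = math.exp(BABR_INFANTRY)
p_reference = inv_logit(INTERCEPT)
p_infantry = inv_logit(INTERCEPT + BABR_INFANTRY)

print(f"odds ratio, Infantry vs. reference branch: {odds_ratio:.2f}")
print(f"predicted probability, reference branch: {p_reference:.3f}")
print(f"predicted probability, Infantry: {p_infantry:.3f}")
```

An odds ratio above 3 means the estimated odds of remaining on active duty for an Infantry officer are more than three times those of an officer in the reference branch, other terms held fixed.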

4.  Confusion  Matrix,  Retention  Model  2 

We used the following predicted probability threshold for our confusion matrix: $\hat{\pi}_i > 0.60$, retained (active duty); $\hat{\pi}_i < 0.60$, loss. This threshold maximized the correct classification rate and minimized the incorrect classification rate. Table 25 shows the confusion matrix for the stepwise model (GLM 4).

Raw Numbers
                         Observed Value
                        Active      Loss
Model       Active         507       337
Predicted   Loss           199       369

Percentages
                         Observed Value
                        Active      Loss
Model       Active         60%       40%
Predicted   Loss           35%       65%

Table  25.  Active  versus  Loss  Confusion  Matrix 

Reading Table 25 by row, 60 percent of the officers the model predicted to remain on active duty did remain, and 40 percent left. Of the officers the model predicted to leave active service (loss), 65 percent left and 35 percent remained. The model predicted the occurrence of loss five percentage points better than it did the occurrence of remaining on active duty.

5.  Cross-validation  Results 

Using R, we obtained a CV estimate of 39 percent, which is the percentage of responses the model misclassified.

DISCUSSION AND IMPLICATIONS


A.  HARDINESS  PREDICTION 

Results  from  the  first  hardiness  model  indicated  that  less-educated  but 
hard-working  fathers  (of  males)  and  highly  educated,  hard-working  mothers 
produce children higher in hardiness. We list a few preliminary results from
hardiness model 1:

•  Less-educated  fathers  of  male  cadets  (interaction),  educated 
mothers,  public  high  schools,  and  private  religious  high  schools 
relate  positively  to  hardiness. 

•  Hardiness  is  unrelated  to  gender  and  CEER.  Bartone  et  al.  (2009) 
had  similar  findings  and  Matthews  et  al.  (2000)  state  “...until 
stronger  causal  models  have  been  developed,  it  seems  safest  to 
follow  Halpern  (1992)  in  supposing  that  sex  differences  in  test 
performance  may  reflect  a  variety  of  interacting  biological  and 
cultural  influences.” 

Although initial analyses (N = 3,716) found no pre-admission predictors strongly related to hardiness, subsequent exploration of a second hardiness model revealed useful observations. First, hardiness model 2 revealed that less-educated mothers have a negative effect on the hardiness of cadet football players. This is consistent with the results from hardiness model 1, which suggested educated mothers have a positive effect on total hardiness. It is possible that a mother’s victory over academic trials improves her son’s ability to control outcomes and stay committed in the face of adversity.

Second, we see that liberal political views relate negatively to total hardiness. It is not readily apparent why liberal football players have lower hardiness, but we surmise these cadets lack the commitment and control facets of hardiness needed to succeed at the academy. Further investigation revealed the tendency of liberals to be higher in hardiness-challenge (see Figure 11).9 It is possible that the negative coefficient estimate is associated with the liberal/hardiness-challenge relationship.

Figure 11. Hardiness-challenge versus Political Views


Third, the model suggests that the hardiness of cadet football players educated in the public school system remains unaffected by fathers’ degree level. Since the majority of USMA cadets come from public schools, this finding encourages us to inspect the hardiness of cadets from other high school types, namely, private religious high schools. It may be that educated fathers, who mask their detachment by enrolling their sons in private school, produce less hardy USMA cadets.

9 Dots: outliers; top and bottom “T”: maximum and minimum values, respectively; top and bottom of boxes: upper and lower quartiles (25 percent of the data); dark horizontal line: median (50 percent of the data).

Next,  we  see  from  Figure  6  that  USMAPS  attendees  with  conservative 
political  views  exhibit  lower  average  hardiness  while  politically  liberal  and 
moderate  USMAPS  attendees  tend  to  possess  greater  hardiness. 

Lastly,  we  comment  on  race.  The  summary  output  and  interaction  plot 
indicate  African  Americans  with  fathers  who  served  in  the  military  relate  most  to 
hardiness.  Surely,  there  are  societal  and  familial  reasons  why  African  Americans 
differ  from  Caucasians. 

Although  there  are  rules  and  regulations  regarding  the  selection  of 
candidates  based  on  race,  gender,  etc.,  we  recommend  additional  study  of 
hardiness  and  race,  in  combination  with  other  factors  (e.g.,  parents’  education 
level).  Specifically,  additional  minority  data  would  increase  their  sample  size  and 
perhaps  reveal  noteworthy  findings. 

B.  PREDICTION  OF  RETENTION  AT  USMA 

Results from the first retention (graduation versus separation) analysis show that academic scores (i.e., APSC10, PPSC, and MPSC) relate to retention the most. Adequate academic, military and physical performance appears unattainable for most of the separated cadets. It suffices to say, separated cadets most likely encountered academic, military or physical trouble.

We expected a hardiness facet to show a strong relationship with retention, much like its sister personality factor grit. However, hardiness-challenge, only moderately effective, emerged as the only significantly related hardiness facet. The relationship between hardiness-challenge, academic performance and retention seems to support previous research from Bartone et al. (2013), who found a pattern suggesting cadets high in CEER11 and hardiness-challenge perform worse in leadership tasks. We can conclude that the result of poor (leadership) performance is eventual separation from the Academy.

10 We mention APSC even though we did not include it in the model. Recall that APSC strongly correlated with MPSC and CEER. The absence of APSC from the final model does not change the fact that academic performance is important to retention. Perhaps Cadet Performance Score would be a likely alternative to use in place of MPSC, APSC, PPSC, and CEER.

The  absence  of  hardiness-commitment  from  the  final  model,  although 
surprising,  reveals  that  individuals  “committed  or  vigorously  engaged  to  work,  life, 
others  and  activities...”  possess  limited  potential  for  retention  when  academic 
proficiency  is  lacking.  This  may  indicate  that  hardiness  and  retention  relate  most 
when  those  who  possess  academic  ability  above  a  certain  threshold  need  an 
extra  boost  to  perform. 

By  definition,  hardiness-control  “...urges  a  person  to  persevere  so  that  his 
efforts  influence  events  and  outcomes”  (Maddi  et  al.,  2012).  Hardiness-control 
seems  to  pick  up  where  hardiness-commitment  leaves  off,  taking  a  proactive  and 
positive  response  to  adversity.  However,  the  absence  of  hardiness-control  in  the 
retention  models  reveals  the  power  of  tangible  factors  (i.e.,  academic  letter 
grades)  to  play  a  stronger  role  in  the  retention  outcome  of  USMA  cadets  over  the 
internally  driven  hardiness.  Perhaps  hardiness  becomes  apparent  in  small 
numbers  of  the  population  who  are  on  a  “thin  line”  of  failure  and  passing. 

The  presence  of  fathers’  degree  in  the  model  indicates  the  negative 
influence  on  retention  by  an  external  (to  the  individual  cadet)  factor.  We  cannot 
ascertain  why  fathers’  education  level  affects  retention  in  this  way;  future  study 
may  shed  light  on  this  phenomenon. 

Next, we identified two athletically related variables associated with retention. First, varsity athletes, among the most task-saturated cadets at USMA, undergo various testing on and off the athletic field. Varsity athletes also travel across the country representing West Point on a weekly basis (during the academic year). However, it is not clear whether the negative effect from athlete type is due to the demands of the increased workload experienced during the academic year, the sport type, recruiting challenges (less than well-rounded cadets who are strong athletically but moderate or weak academically), or something else.

11 Table 17 showed APSC highly correlated with CEER.

PAE, the second athletically related variable in the model, is of small consequence but is, interestingly, one of the indicators obtained by pre-admission testing. However, without further study we cannot accurately link PAE to retention.

Lastly, the cross-validation misclassification rate of 22 percent indicates that our model predicts the correct retention response 78 percent of the time.

C.  U.S.  ARMY  RETENTION  AND  THE  USMA  GRADUATE 

The  retention  model  revealed  a  truth  familiar  to  most  post-GWOT  Army 
officers,  namely,  that  war  affects  infantry  officers  the  most.  That  infantry  officers 
are  the  first  the  raise  their  hand  and  depart  from  service,  is  perhaps  a 
misconception.  On  the  contrary,  GLM  4  indicates  the  Infantry  basic  branch 
positively  relates  to  active  duty  (retention). 

Reasons why Infantry officers remain on active duty longer than their
peers, although not investigated in the present work, no doubt include the ability
to persevere through a variety of work, family, and stress-related
circumstances experienced over multiple deployments. Previous work (Britt et
al., 2001) supports this finding when we examine the lifestyle of the average U.S.
Army Infantry officer: “those engaged in meaningful work during the deployment,
derive benefits months after...”

The Army utilizes the Infantry branch more frequently than any other
branch in war situations. In fact, the other Army branches exist to support the
Infantry. Hence, it is no surprise that those “engaged in work” possess greater
potential for retention than the less engaged. Infantry’s sister “maneuver”
branches (i.e., military police, armor, engineers [12]) also relate positively to
retention, thus supporting this finding.


Variable   Estimate   Std. Error   z value   Pr(>|z|)   Branch Area
babrMP     0.847      0.444        1.909     0.056      MFE
babrEN     0.557      0.299        1.862     0.063      MFE
babrAR     0.500      0.290        1.726     0.084      MFE

Table 26. Maneuver Branches Positively Related to Retention

Lastly,  the  misclassification  rate  indicates  our  model  predicts  the  correct 
retention  response  61  percent  of  the  time. 
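
The Table 26 entries can be cross-checked and put on a more interpretable scale. Assuming they are ordinary logit coefficients with Wald tests (consistent with the "z value" and "Pr(>|z|)" columns), z is the estimate divided by its standard error, the two-sided p-value follows from the standard normal distribution, and exp(estimate) is the odds ratio relative to whatever reference branch the model used (the reference level is not stated in this section). The helper name below is illustrative.

```python
import math

# Estimates and standard errors copied from Table 26.
table_26 = {
    "babrMP": (0.847, 0.444),
    "babrEN": (0.557, 0.299),
    "babrAR": (0.500, 0.290),
}


def wald_summary(estimate, std_error):
    """Return (z, two-sided p-value, odds ratio) for one logit coefficient."""
    z = estimate / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    odds_ratio = math.exp(estimate)
    return z, p_value, odds_ratio


results = {name: wald_summary(b, se) for name, (b, se) in table_26.items()}
for name, (z, p, odds) in results.items():
    print(f"{name}: z = {z:.3f}, p = {p:.3f}, odds ratio = {odds:.2f}")
```

Small differences from the tabled z values are expected because the table reports rounded estimates and standard errors.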

D.  CONCLUSIONS  AND  FUTURE  WORK 

1.  Pre-admission  Predictors  of  Hardiness 

The power of hardiness to predict performance across multiple contexts
inspires further exploration of a hardiness predictor among pre-admission
variables and across various groups. The hardiness subset of football players
revealed the power of linear regression to develop a hardiness predictor in a
target population. We recommend investigating which target group(s) a hardiness
predictor works best for, and why. For example, is hardiness more predictable
among varsity athletes or non-varsity athletes?

Additionally, the significance of race in the model suggests that
hardiness differs by race, even when other factors are held constant. We
recommend further exploration to determine which factors influence hardiness in
each race. Results may influence training and faculty development at USMA.


12 Although the armor and engineer branches were not significant at the p<.05 level, we
included them in Table 26 to show their positive coefficient estimates and communicate how
close their p-values came to 0.05.

Lastly, we recommend the development of a pre-admission hardiness
predictor using the ideas mentioned above as a contextual framework. We could
utilize USMA’s current hardiness battery to develop additional guideposts that aid
the USMA admissions committee in assessing a candidate’s future hardiness.
From the battery, we can identify corresponding activities, scores, etc., obtained
during high school. The hardiness predictor, along with other metrics, might
either form part of a revised WCS or serve as a stand-alone “hardiness
predictor” score.

2.  USMA  Predictors  of  U.S.  Army  Leader  Performance 

Measuring true leader performance is often confusing and subjective,
leaving detached superiors to rate their subordinate leaders. The U.S. Army is
no different. The 2009 article by Bartone et al. investigated the most predictive
attributes of leader performance while at USMA. We recommend extending the work
started by these researchers to encompass leader performance in the U.S. Army.
Leader performance at USMA should be a stepping-stone goal to performing well
as a military officer. Such a study could also tie in hardiness to discover its
relationship to officer evaluations and promotion.

3.  Demographics,  Parents’  Education  and  USMA  Retention 

Father’s degree relates negatively to retention when the level attained is
high school, but by how much? The coefficient estimate, $\beta_{\text{f.degree(hs)}} = -0.296$, is
low (less than 1.0 in magnitude). Further investigation may prove beneficial, especially
in light of the relationship between father’s degree (high school) and male cadets.
Additionally, future study could identify the effect a father’s degree (high school)
has on other demographics (e.g., race) to discover, for example, if minority
retention differs from the majority.
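
One hedged way to answer "by how much" is to exponentiate the estimate. This assumes the value of -0.296 is a coefficient on the log-odds scale from the logistic retention model, and that the comparison is against the model's reference education level, which this section does not name.

```latex
\text{odds ratio} = e^{\beta} = e^{-0.296} \approx 0.744
\qquad\Longrightarrow\qquad
(1 - 0.744) \times 100\% \approx 26\% \text{ lower odds of retention}
```

Under those assumptions, and holding the other predictors fixed, the odds of retention for a cadet whose father's degree level is high school are roughly three quarters of the reference group's odds.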

4.  Retention  of  USMA  Graduates  in  the  U.S.  Army 

The lack of predictive power in the retention model suggests that we need
additional research. Every year USMA’s assessment steering committee (ASC)
contacts many U.S. Army supervisors and raters of USMA graduates. ASC’s
goal is to determine if the customer (U.S. Army) is satisfied with its product
(USMA graduate/second lieutenant). New to this discussion are a USMA
graduate’s reason(s) for leaving active federal service (i.e., whether the product
is satisfied with the customer). Retention research on junior surface warfare
officers (Gjurich, 1999) found the amount of time spent on sea duty, the
perceived probability of finding a civilian job, satisfaction with pay and
allowances, satisfaction with current military job, satisfaction with job training,
and satisfaction with working conditions to be most influential in the career
decision to remain in service beyond the initial obligation. For U.S. Army officers,
we recommend the investigation of similar factors, namely, the ratio of deployment
to non-deployment time, perception of the economy and probability of finding a
civilian job, and satisfaction with pay, job, etc.

Future research may guide the development and issuance of a
questionnaire by ASC for officers released from active federal service. The goal
is to identify additional predictors of retention that USMA could use to
enhance curriculum, policy, selection and training.

5.  MacArthur’s  Proclamation:  Athletes  and  Leadership 

Among other things, we remember General Douglas MacArthur for his
decision to place special emphasis on athletics at USMA. He believed athletes
made the best leaders because sports situations mimicked those on the field of war.
Future research should include the role of athletics at USMA and beyond. The
study would seek to investigate MacArthur’s assertion that athletes make the
best leaders. Because all USMA cadets participate in athletics, the study would
distinguish between intramural, club and varsity (intercollegiate) athletes and
evaluate their U.S. Army military performance and retention.

APPENDIX A. GLOSSARY


Athletic  Activities  Score  (AAS):  A  score  reflecting  a  candidate's  athletic 
participation  awarded  in  accordance  with  guidelines  established  by  USMA 
admissions  department.  See  Appendix  B  for  additional  explanation. 

Captains  Career  Course  (CCC):  U.S.  Army  school  designed  for 
preparing  company  level  officers  to  command,  staff,  and  manage  operations  at 
the  operation  unit  level.  The  course  scope  includes  training  instruction  and 
practical  exercises  in  Army  operations,  professional  military  topics  in  common 
functional  areas,  unit  leadership,  doctrinal  base,  tactical  decision  making, 
maintenance  and  logistics,  and  military  writing  (USACAC,  2013). 

Community  Leadership  Score  (CLS):  A  score  composed  of  the  sum  of 
a  USMA  candidate’s  AAS,  EAS  and  FAS,  divided  by  three.  See  Appendix  B  for 
additional  explanation. 

Corps  Squad  Athlete:  West  Point  intercollegiate  sports  athlete 

Extracurricular  Activities  Score  (EAS):  A  score  reflecting  a  candidate's 
participation  in  activities  outside  required  school  curricula  awarded  in  accordance 
with  the  guidelines  established  by  USMA  Admissions  Department.  See 
Appendix  B  for  additional  explanation. 

Faculty  Appraisal  Score  (FAS):  The  average  candidate  scores  on  the 
school  official  evaluation  (SOE)  of  candidate  forms  (DD  Form  1869)  on  a  scale  of 
40/740.  See  Appendix  B  for  additional  explanation. 

Firstie:  A  member  of  West  Point’s  senior  class. 

Intermediate Level Education (ILE): The purpose of the Army's ILE
program is to provide all mid-grade officers a basic foundation of professional
military education and leader development training. It develops leaders prepared
to execute full spectrum operations; trains and educates leaders in the practice
and values of the profession of arms; and prepares leaders to operate in joint,
multi-national and interagency environments. ILE prepares officers for duty as
field grade commanders and staff officers throughout the Army, primarily at
brigade and higher echelons (HRC, 2013).

Physical Assessment Exam Score (PAE): A score achieved by a
USMA candidate upon successful completion of the basketball throw, pull-ups (or
flexed-arm hang for women), standing long jump and the 300-yard shuttle run.
Recently revised and replaced by the candidate fitness assessment (CFA), a
pre-admission assessment also used by the U.S. Naval Academy and U.S. Air Force
Academy.

Plebe: A member of West Point’s freshman class.


APPENDIX B. USMA ADMISSIONS FILE CALCULATIONS


(Taken  from  Class  of  2000  WCS  Calculations  Sheet  and  from  a  USMA 
Admissions  document  entitled  “Annex  A:  Quantification  of  Candidate  File 
Components”) 

WHOLE  CANDIDATE  SCORE  (WCS) 

WCS:  (6  x  CEER)  +  (3  x  CLS)  +  (PAE  SCORE) 

COLLEGE  ENTRANCE  EQUIVALENCE  SCORE  (SCHOLASTIC  APTITUDE  TEST,  “SAT”) 
CEER:  (.364  x  HSR)  +  (.269  x  SATV)  +  (.432  x  SATM)  -  48 

COLLEGE  ENTRANCE  EQUIVALENCE  SCORE  (AMERICAN  COLLEGE  TEST,  “ACT”) 
ACEER: (.219 x HSR) + (9.43 x ACTM) + (4.62 x ACTE) + (0.45 x ACTS) + (4.01 x ACTR) - 41.5

HIGH  SCHOOL  RANK  (HSR) 

HSR:  ((2  x  HS  STANDING)  -1)  /  (2  x  CLASS  SIZE); 

*HSR  TABLE  REQUIRED  TO  CONVERT  CALCULATED  RESULT  TO  HSR  SCORE 

COMMUNITY  LEADER  SCORE  (CLS) 

CLS:  (EX+  AT+  FAS)  /  3 
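
The WCS, CEER, HSR and CLS formulas above translate directly into code. The following is a minimal sketch: the function and argument names are illustrative, the candidate values are hypothetical, and the HSR conversion table mentioned above is not reproduced, so `hs_rank_fraction` returns only the raw fraction.

```python
def ceer(hsr, satv, satm):
    """College entrance equivalence score from SAT inputs (Appendix B formula)."""
    return 0.364 * hsr + 0.269 * satv + 0.432 * satm - 48


def hs_rank_fraction(standing, class_size):
    """Raw class-rank fraction; a lookup table is still needed to get HSR."""
    return (2 * standing - 1) / (2 * class_size)


def cls(ex, at, fas):
    """Community leader score: mean of the EX, AT and FAS component scores."""
    return (ex + at + fas) / 3


def wcs(ceer_score, cls_score, pae_score):
    """Whole candidate score: weighted sum of CEER, CLS and PAE."""
    return 6 * ceer_score + 3 * cls_score + pae_score


# Hypothetical candidate (values chosen only to exercise the formulas).
candidate_ceer = ceer(hsr=600, satv=600, satm=600)
candidate_cls = cls(ex=600, at=500, fas=550)
candidate_wcs = wcs(candidate_ceer, candidate_cls, pae_score=500)
print(round(candidate_ceer, 1), round(candidate_cls, 1), round(candidate_wcs, 1))
```

The weights show that the academic component dominates: a one-point change in CEER moves WCS six points, against three for CLS and one for PAE.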

EXTRACURRICULAR  ACTIVITIES  SCORE  (EX):  A  score  reflecting  a  candidate's 
participation  in  activities  outside  required  school  curricula  awarded  in  accordance  with  the 
following  guidelines: 

800:  An  outstanding  young  person  with  quadruple  participation  or  honors  and  awards  on 
selected  extracurricular  activities  (each  worth  600  or  more  points). 

700: 

(1)  Student  Council  President; 

(2)  Triple  participation  or  honors  and  awards  in  selected  extracurricular  activities  (each 
worth  600  points); 

(3)  Participation  in  Boys/Girls  Nation; 

(4)  JROTC  Regimental/Brigade  Commander  or  Civil  Air  Patrol  Spaatz  Award  winner; 

(5) Decoration for valor [Soldiers];

(6)  Ranger  or  Special  Forces  tab  [Soldiers]. 

600: 

(1)  High-school  Class  President; 

(2)  Editor-in-chief  of  a  school  publication; 

(3)  Participation  in  Boys/Girls  State,  President  of  National  Honor  Society,  or  recipient  of  a 
National  or  State  award; 

(4)  Eagle  Scout  (Boy  Scouts)  or  Gold  Award  (Girl  Scouts); 

(5)  Triple  participation  or  honors  and  awards  in  selected  extracurricular  activities  (each 
worth  500  points) 

(6)  Earhart /  Mitchell  Award; 

(7)  Combat  Infantryman  Badge;  Combat  Action  Badge;  Combat  Medical  Badge 
[Soldiers]; 

(8)  Soldier's  Medal  [Soldiers]; 

(9)  Soldier  of  the  Year-brigade-level  or  higher  [Soldiers]; 

(10) Division-level In-Service Recruiting Program [Soldiers].

500:

(1)  Holder  of  one  or  more  elective  offices  in  moderately  selective  organizations; 

(2)  Participation  in  activities  or  recipient  of  awards  in  moderately  selective  organizations; 

(3)  Holder  of  a  private  pilot's  license; 

(4)  EMT/EMS  or  Volunteer  Firefighter; 

(5)  National  Honor  Society  VP/Treasurer  or  Secretary; 

(6)  Civil  Air  Patrol  officer/  1SG; 

(7)  Combat  veteran  of  three  or  more  months  in  theater  [Soldiers]; 

(8)  Expert  Infantryman  Badge  or  Expert  Field  Medical  Badge  [Soldiers]; 

(9)  Meritorious  Service  Medal  [Soldiers]; 

(10) Distinguished Honor Graduate of Army school [Soldiers];

(11) Soldier of the Quarter-brigade-level or higher [Soldiers].

400: 

(1)  Participation  in  activities  or  recipient  of  awards  in  organizations  with  limited  selectivity; 

(2) Non-commissioned officer [Soldiers];

(3)  Squad  Leader  or  Platoon  Guide  [Soldiers]; 

(4)  90-day-plus  OCONUS  tour  [Soldiers]; 

(5)  Army  Commendation  Medal  [Soldiers]; 

(6)  Master  Fitness  Trainer  [Soldiers]; 

(7)  Honor  Graduate  of  an  Army  school  [Soldiers]; 

(8)  PLDC  graduate  [Soldiers]; 

(9)  BOSS  Representative  [Soldiers]. 

300: 

(1)  Some  participation  in  organized  activities; 

(2)  Army  Achievement  Medal  or  Good  Conduct  Medal  [Soldiers]. 

200:  No  participation  in  organized  activities. 

ATHLETIC  ACTIVITIES  SCORE  (AT):  A  score  reflecting  a  candidate's  athletic 
participation  awarded  in  accordance  with  the  following  guidelines: 

800:  An  outstanding  athlete  (All-American,  First  team  All-Area  selection  in 
baseball/softball,  basketball  or  football)  and  either  Athletic  rating  of  1  or  2  in  the  sport  in 
which  honors  are  received  or  CFA  score  >  650. 

700: 

(1)  First-team  All-Area  selection  in  a  single  sport  (other  than  baseball/softball,  basketball 
or  football); 

(2)  Captain  of  baseball/softball,  basketball,  or  football  team; 

(3) Team captain in two or more sports (other than baseball/softball, basketball or
football) (for class size over 100); and

(4)  Ranger  or  Special  Forces  tab  [Soldiers]. 

600: 

(1)  Captain  of  team  (other  than  baseball/softball,  basketball,  or  football); 

(2) Varsity letter in baseball/softball, basketball, or football; and

(3)  Varsity  letter  in  two  or  more  sports  (other  than  baseball/softball,  basketball,  or 
football). 

500: 

(1)  Varsity  letter  in  a  single  sport  (other  than  baseball/softball,  basketball,  or  football);  and 

(2) Expert Infantryman Badge, Expert Field Medical Badge, Jumpmaster, or Presidential
Fitness award [Soldiers].

400:

(1) Participation in a varsity sport (no letter);

(2) Graduate of Airborne, Air Assault, Pathfinder, or comparable other Army school
[Soldiers]; and

(3)  Maximum  score  on  Army  Physical  Fitness  Test  [Soldiers]. 

300: 

(1)  Participation  in  junior-varsity  and  other  team  sports  (not  intramurals);  and 

(2)  Soldier  status. 

200:  No  participation  and  no  evidence  of  interest  in  sports. 

FACULTY APPRAISAL SCORE (FAS): The average of the candidate's scores on the
School Official Evaluation (SOE) of Candidate Forms (DD Form 1869) on a scale of
40/740.

NOTE:  The  information  above  contains  general  guidance  on  the  components  used  to 
compute  a  Community  Leader  Score  (CLS).  In  a  process  as  imprecise  as  leadership 
assessment,  subjective  judgment  must  be  applied  to  the  evaluation  process  in  order  to 
take  into  consideration  special  situations:  e.g.,  an  unusually  high  or  low  Faculty  Appraisal 
Score  (FAS)  that  is  inconsistent  with  other  elements  of  the  candidate  record;  athletic 
achievement  in  an  extremely  large  or  small  school  or  an  excellent  or  marginal  program; 
an  activity  record  that  may  not  fit  the  categorizations  of  the  Candidate  Activities  Record. 
The  Admissions  Office  and  the  Admissions  Committee  are  expected  to  make 
adjustments  in  the  components  of  the  CLS  to  take  into  account  such  situations. 

APS:  (.001926  x  HSR)  +  (.002283  x  SATM)  +  (.001421  x  SATV)  -  .6865 

HPA NEW SAT: (.001070 x SATM) + (.003462 x SATV) + (.002035 x HSR) - 1.390

HPA ACT: (.001249 x HSR) + (.04132 x ACTE) + (.01087 x ACTM) + (.02944 x ACTSR) - .3257

MSE NEW SAT: (.004884 x SATM) - (.000093 x SATV) + (.002477 x HSR) - 1.652

MSE ACT: (.002004 x HSR) + (.1487 x ACTM) + (.03713 x ACTSR) • (.02022 x ACTR) - (.06084
x ACTM(GT)) - 2.2873

RISK  LEVELS  AND  REQUIRED  CHECKS: 

SATV <560
SATM <560
ACTE <23
ACTM <24
ACTR <24
ACTS <23
CEER/ACEER <520
CLS <450
PAE <420
FAS <525
WCS <5200
HPA <2.10
MSE <2.10
APS <2.15
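
The thresholds above lend themselves to a simple lookup. The sketch below assumes that a score strictly below its listed value is what triggers a risk check; that reading, the function name, and the sample candidate are assumptions, since the source lists only the cutoffs.

```python
# Cutoffs copied from the RISK LEVELS AND REQUIRED CHECKS list.
RISK_THRESHOLDS = {
    "SATV": 560, "SATM": 560,
    "ACTE": 23, "ACTM": 24, "ACTR": 24, "ACTS": 23,
    "CEER/ACEER": 520, "CLS": 450, "PAE": 420, "FAS": 525,
    "WCS": 5200, "HPA": 2.10, "MSE": 2.10, "APS": 2.15,
}


def risk_flags(scores):
    """Return the names of supplied scores that fall below their cutoff."""
    return sorted(
        name
        for name, value in scores.items()
        if name in RISK_THRESHOLDS and value < RISK_THRESHOLDS[name]
    )


# Hypothetical candidate file (not thesis data).
candidate = {"SATV": 540, "SATM": 610, "CLS": 450, "PAE": 400, "WCS": 5696}
flags = risk_flags(candidate)
print(flags)
```

A score exactly equal to its cutoff (CLS = 450 here) is not flagged under the strict "less than" reading.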

DEFINITIONS:

LEADER: CLS > 650
SCHOLAR: CEER OR ACEER > 650

ESTIMATES:

FAS = 600
PAE = (AT + 400)/2


APPENDIX C. ORIGINAL VARIABLES


The final column shows the ★ marks printed under the Hardiness, Retention #1 and
Retention #2 columns for each variable.

#   Variable Name (42)                        Type  Levels  Range (min, max)  Hardiness / Retention #1 / Retention #2
1   Gender of the Cadet                       CN    2                         ★
2   Racial/Ethnic Descent category            CN    7                         ★
3   Parent Graduated USMA                     CN    4                         ★
4   Parent with Service                       CN    4                         ★
5   Father's Career [*]                       CN    47                        ★ ★ ★
6   Mother's Career [*]                       CN    47                        ★ ★ ★
7   Type of High School Attended [**]         CN    8                         ★
8   Recruited Athlete                         CN    2                         ★
9   Political Orientation                     CO    6                         ★
10  USMA Prep School Graduate                 CN    2                         ★
11  College Entrance Equivalence Rating       QI            (407, 800)        ★ ★ ★
12  Extracurricular Activities Score          QI            (200, 800)        ★ ★ ★
13  Athletic Activities Score                 QI            (200, 800)        ★ ★ ★
14  Faculty Appraisal Score                   QI            (0, 740)          ★ ★ ★
15  Physical Aptitude Exam Score              QI            (0, 800)          ★ ★ ★
16  Total Hardiness Score                     N             (0.53, 3.0)       ★
17  Hardiness Challenge                       N             (0, 3.0)          ★ ★
18  Hardiness Commitment                      N             (0.6, 3.0)        ★ ★
19  Hardiness Control                         N             (0.4, 3.0)        ★ ★
20  Academic Program Score                    N             (0.00, 4.22)      ★ ★
21  Military Program Score                    N             (0.429, 4.188)    ★ ★
22  Physical Program Score                    N             (0.00, 4.179)     ★ ★
23  Competitive Club Sport [#]                CN    29                        ★ ★
24  USMA Status Code                          CN    2                         ★
25  Basic Branch                              CN    19                        ★
26  Active Duty Status                        CN    2                         ★
27  Personal Identification Number (Pin ID)   CN    3716
28  Class Admitted to                         CN    3
29  Class Year of Record                      CN    6
30  Graduation Date from USMA                 CN    19
31  Commissioning Date from USMA              CN    19
32  Recruited Athlete Rating Code             CO    6
33  Sport Recruited for                       CN    21
34  Corps Squad Sport [#][***]                CN    27                        ★ ★ ★
35  Whole Candidate Score                     QI            (4698, 7331)
36  Community Leadership Score                QI            (418, 775)
37  Cadet Performance Score                   N             (0, 3.9)
38  Academic Quality Point Average            N             (.458, 4.198)
39  Date of Loss to USMA                      CN    483
40  Date of Loss Active Duty                  CN    279
41  Years of Service as of 31 JAN 2013        N             (0, 7.80)
42  Rank on Active Duty                       CO    3

[*] Used to create variables f.degree & m.degree, denoting the education level needed for parents' career type
[**] 73 "NA" values were imputed "public," the average type of high school
[***] Used to create variable playedsportl, denoting whether a cadet played an intercollegiate sport or not
[#] Used to create the variable type.ath, denoting competitive sport type (i.e., intramural, club, varsity)

CN  Categorical, Nominal
CO  Categorical, Ordinal
QI  Quantitative, Integer
N   Continuous, Numeric

Table 27. Hardiness and Retention Variables


Explanations of the variables should be self-evident; however, we
describe extracurricular activity, faculty appraisal, athletic activity, physical
assessment exam and community leadership scores in the glossary (Appendix
A).

OIR also released variables, not included above, useful for identification
purposes. To comply with human subjects research restrictions, we used the
PIN variable to assign a pseudo name to each data entry. This prevented the
divulging of personally identifying information. “Class admitted to” and “class
year of record” are redundant variables previously used to identify year groups.
“Date of loss” (to USMA), “commissioning date” (from USMA), and “date of loss”
(while on active duty) variables indicated when a cadet departed West Point,
commissioned in the U.S. Army, and departed U.S. Army active duty service,
respectively. “Active duty status” and “rank” (on active duty) indicated whether
the USMA graduate remained in the Army or not, and their current, or attained,
rank.

A.  VARIABLES  NOT  USED  IN  THE  ANALYSIS: 

We  did  not  use  “class  year  of  record”  (not  shown)  in  this  analysis  due  to 
redundancy.  Few  entries  had  differing  years  for  “class  year  of  record”  and  “class 
admitted  to.” 

“Date of loss” (to USMA) and “date of loss” (while on active duty) (neither
shown) were not used in this analysis due to redundancy or lack of need. Dates
were not of particular interest to this project; rather, the occurrence of the
outcome (i.e., loss) proved beneficial. USMA “graduation status” (whether a
cadet graduated or separated) and “active duty status” were deemed sufficient to
address the retention research question.

“Commissioning  date”  and  “graduation  date,”  (neither  shown)  did  not 
appear  to  differ  from  each  other  except  in  a  fraction  of  cases.  We  discarded  the 
noted  cases  from  the  final  data  set. 

Lastly, we did not use “rank” (not shown) in the analysis. Rank attained
between the sixth and eighth year of service for most Army officers in a particular
year group remained the same (captain). There are exceptions (e.g., officers
branched into the medical field to serve as doctors or those who did not receive
promotion for disciplinary reasons). We show the initial groupings of original
variables into data sets below (Table 28). Appendix D contains sample images,
in spreadsheet form, of the data sets.


Group 1: Hardiness Predictor

  Outcome (Dependent) Variables
    Hardiness - Commitment
    Hardiness - Control
    Hardiness - Challenge
    Hardiness - Total

  Predictor (Independent) Variables
    Gender
    Race
    Father's Career
    Mother's Career
    Political View
    Parent USMA Graduate
    Parent Uniformed Service
    Type of High School Attended
    Whole Candidate Score
    College Entrance Equivalency Rating
    Physical Assessment Exam Score
    Community Leadership Score
    Extracurricular Activity Score
    Athletic Activity Score
    Faculty Appraisal Score
    USMA Prep School Attendee
    Recruited Athlete
    Sport Recruited For
    Recruited Player Rating

Group 2: Retention

  Outcome (Dependent) Variables
    USMA Graduation Status
    Years of Service (U.S. Army Officer)

  Predictor (Independent) Variables
    Hardiness - Commitment
    Hardiness - Control
    Hardiness - Challenge
    Hardiness - Total
    Cadet Academic Quality Point Average
    Cumulative Academic Program Score
    Cumulative Military Program Score
    Cumulative Physical Program Score
    Cumulative Cadet Performance Score
    Corps Squad Sport Played
    Club Squad Sport Played
    Basic Branch

Identifier Variables:
    Personal Identification (PIN)
    Class admitted to
    Active Duty Status

Table 28. Original 42 Variables


APPENDIX D. SAMPLE DATASETS


A. SAMPLE HARDINESS DATA SET

Table 29. Hardiness Data Set

B. PARENTS’ DEGREE BREAKDOWN BY CAREER


HighSchool 

College 

GradSchool 

Homemaker 

Teacher-elementary 

College  teacher 

Other 

Teacher-secondary 

Business  executive 

Undecided 

Business  owner 

Therapist 

Unemployed 

Nurse 

Physician 

Farmer/Rancher 

School  counselor 

Clergy 

Business  clerk 

Social  worker 

Policy/Govt 

Semi-skilled 

Business  sales 

Optometrist 

Law  enforcement 

Accountant 

Dentist 

Skilled  trades 

Military  science 

Psychologist 

Laborer 

Programmer 

Veterinarian 

Lab  technician 

Pharmacist 

Science  researcher 

Interior  Decorator 

Dietitian 

Other  religious 

College  admin 

Lawyer 

Artist 

Writer 

Actor 

School  principal 

Engineer 

Architect 

Musician 

Foreign  service 

Conservationist 

Table 30. Parents’ Degree Breakdown by Career Type

C. SAMPLE RETENTION DATA SET, GRADUATION VERSUS SEPARATION

C00088100  2005  0  0  IM   2.2  2    1.8  2     3.077  3.064  2.475  3.194  2.709  1
C00138788  2006  0  0  IM   1.8  1.8  1.6  1.73  3.378  3.391  2.869  3.133  3      1
C00153993  2005  0  0  IM   2.2  1.4  2.8  2.13  2.601  2.58   2.47   2.48   2.29   1
C00308170  2005  1  0  Var  2    2    1    1.67  3.748  3.773  2.901  3.061  3.218  1
C00325548  2006  0  0  IM   2.2  2.2  2.2  2.2   2.892  2.888  2.672  2.786  2.585  1
C00378676  2006  0  0  IM   1.8  2.2  0.8  1.6   2.835  2.833  1.754  1.884  2.069  0
C00392880  2007  0  0  IM   2    2    1.8  1.93  2.639  2.615  2.621  3.154  2.443  1
C00433896  2007  1  0  Var  2.8  2.8  1.6  2.4   3.418  3.419  2.766  3.008  2.935  1

Table 31. Retention Data Set 1, Graduation versus Separation Status

D. SAMPLE RETENTION DATA SET, ACTIVE DUTY VERSUS LOSS

Table 32. Retention Data Set 2, Active Duty versus Loss Status


APPENDIX E. SIMPLE LINEAR REGRESSION APPROACH EXPLAINED


A.  HARDINESS  AND  SIMPLE  LINEAR  REGRESSION 

Mathematical formulation came from Kutner, Nachtsheim, Neter, & Li (2005).

After examining the data, we used regression analysis to explain the
relationship between the chosen predictors and outcome (also called criterion or
response) variables. Generally, the outcome variable we wish to predict,
denoted Y, is the dependent variable. The predictor, denoted X, is referred to as
the independent variable. We relate Y and X as follows:

$$ Y = f(X) \tag{3} $$

Outcome and predictor variables never relate perfectly, but the aim is to
find the tendency of an outcome for Y to vary with the predictor variable X. We
see the imperfect relationship by plotting the two variables in a space containing
an X-axis and Y-axis and drawing a best-fit line. On either side of the line are
scatter points. The distance from each point to the fitted line is called the error
term epsilon (“ε”). ε accounts for randomness and captures the variability in the
outcome unexplained by the predictors (Montgomery et al., 2001). The
introduction of error changes our function to the following:

$$ Y = f(X) + \varepsilon \tag{4} $$

Additionally, there is a probability distribution of Y for each level of X. A
probability distribution assigns a probability to each outcome. Often, regression
models require more than one predictor. The addition of other predictors
changes our best-fit line to a best-fit surface (e.g., a planar surface with two
predictors, $X_1$ and $X_2$). The functional equation then becomes:

$$ Y = f(X_1, X_2, \ldots, X_n) + \varepsilon $$




One of the challenges in designing a regression model is to determine
which predictors $X_i$, $i = 1, \ldots, n$, are to be used and which should be discarded. A
second challenge is finding an appropriate functional form of the model. Two
common functional forms are linear and quadratic models. The most common
form is a regression model in which the regression relation is linear. The general
form follows:

•  One  variable 

f(X)  =  ft0+fi1Xlt 

Y  =  f(X)  +  e,  (6) 


•  Two  or  more  variables 


/(*>  ,x2,...,x,)  =  + PxXx.  +  p2x2i  +  ...+ pxni 

Y  =  f(Xl,X2,...,Xn)  +  e, 

Yi=  00+  P\Xl  i  +  PlX2  i  +-  +  PnXni  +  Si 


where: 

• $Y_i$ is the regression function's outcome at the ith trial (for the ith participant)

• $\beta_0$ is the Y intercept of the regression line

• $\beta_1$ is the slope of the regression line

• $X_i$ is the predictor value at the ith trial (for the ith participant)

• $X_{ni}$ is the nth predictor value at the ith trial (for the ith participant)

• $\varepsilon_i$ is a random error term with mean $E\{\varepsilon_i\} = 0$ and variance $\sigma^2\{\varepsilon_i\} = \sigma^2$

• Note: the $\beta$'s are also called regression coefficients or parameters

• $X_0$, although not shown, is equal to 1 and is paired with $\beta_0$; $\beta_0 X_0 = \beta_0(1) = \beta_0$

When there is only one predictor and the regression coefficients and
predictors are linear (non-exponential, non-multiplicative, etc.), it is a
simple linear regression model.
We may show all i (1 through n) observations of our regression model
using the following matrix representation:

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \quad i = 1, \ldots, n \tag{8}$$

where $\mathbf{Y}$, $\boldsymbol{\beta}$, and $\boldsymbol{\varepsilon}$ are vectors of responses, parameters, and normal, random
errors, respectively. $\mathbf{X}$ is a matrix of constants.


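$$\underset{n \times 1}{\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}} = \underset{n \times p}{\begin{bmatrix} 1 & X_{11} & X_{12} & \cdots & X_{1,p-1} \\ 1 & X_{21} & X_{22} & \cdots & X_{2,p-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{n1} & X_{n2} & \cdots & X_{n,p-1} \end{bmatrix}} \underset{p \times 1}{\begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{p-1} \end{bmatrix}} + \underset{n \times 1}{\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}} \tag{9}$$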

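For visualization purposes, we substitute the actual (text) names for the real
response and predictors for the hardiness (“H” below) model in equation (10).

$$\underset{n \times 1}{\begin{bmatrix} H_1 \\ H_2 \\ \vdots \\ H_n \end{bmatrix}} = \underset{n \times p}{\begin{bmatrix} 1 & X_{1,\text{Gender}} & X_{1,\text{Race}} & \cdots & X_{1,\text{Phys.Aptd.Score}} \\ 1 & X_{2,\text{Gender}} & X_{2,\text{Race}} & \cdots & X_{2,\text{Phys.Aptd.Score}} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{n,\text{Gender}} & X_{n,\text{Race}} & \cdots & X_{n,\text{Phys.Aptd.Score}} \end{bmatrix}} \underset{p \times 1}{\begin{bmatrix} \beta_0 \\ \beta_{\text{Gender}} \\ \beta_{\text{Race}} \\ \vdots \\ \beta_{\text{Phys.Aptd.Score}} \end{bmatrix}} + \underset{n \times 1}{\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}} \tag{10}$$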


Reverting  to  the  previous  notation  using  Xs  and  Ys,  we  use  matrix 
multiplication  and  addition  to  obtain  a  linear  system  of  equations. 


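$$\underset{n \times 1}{\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}} = \underset{n \times p}{\begin{bmatrix} 1 & X_{11} & X_{12} & \cdots & X_{1,p-1} \\ 1 & X_{21} & X_{22} & \cdots & X_{2,p-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{n1} & X_{n2} & \cdots & X_{n,p-1} \end{bmatrix}} \underset{p \times 1}{\begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{p-1} \end{bmatrix}} + \underset{n \times 1}{\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}} \tag{11a}$$

$$\mathbf{Y} = \underset{n \times 1}{\begin{bmatrix} \beta_0 + \beta_1 X_{11} + \cdots + \beta_{p-1} X_{1,p-1} \\ \vdots \\ \beta_0 + \beta_1 X_{n1} + \cdots + \beta_{p-1} X_{n,p-1} \end{bmatrix}} + \underset{n \times 1}{\begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{bmatrix}} \tag{11b}$$

$$\mathbf{Y} = \underset{n \times 1}{\begin{bmatrix} \beta_0 + \beta_1 X_{11} + \cdots + \beta_{p-1} X_{1,p-1} + \varepsilon_1 \\ \vdots \\ \beta_0 + \beta_1 X_{n1} + \cdots + \beta_{p-1} X_{n,p-1} + \varepsilon_n \end{bmatrix}} \tag{11c}$$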

We see multiple versions of equation (7) as the system of equations in
matrix form (equations 11a, 11b, and 11c). The known values in most
regressions are the Ys and the Xs. One of the key points of regression is
estimating the beta values in such a way that the right-hand side of the
equation (Xs, betas, error) is close to the left-hand side (Ys). The most
popular way of finding good estimates is by using the ordinary least squares
method.
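
The matrix form above can be sketched in R. This is only an illustration on simulated data: the predictor names `x1` and `x2`, the coefficient values, and the sample size are assumptions and do not come from the cadet data. It shows that `model.matrix()` supplies the leading column of ones paired with the intercept, and that the single matrix product reproduces the row-by-row linear equations.

```r
# Minimal sketch with simulated data; all names and values are illustrative.
set.seed(1)
n    <- 6
dat  <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
beta <- c(2, 0.5, -1)          # assumed beta_0, beta_1, beta_2
eps  <- rnorm(n, sd = 0.1)     # random error vector (n x 1)

# model.matrix() adds the column of ones that is paired with beta_0
X <- model.matrix(~ x1 + x2, data = dat)   # n x p design matrix, here p = 3

# Matrix form: Y = X beta + eps
Y <- X %*% beta + eps

# The same system written out one linear equation per observation
Y_rows <- beta[1] + beta[2] * dat$x1 + beta[3] * dat$x2 + eps

dim(X)                             # n rows, p columns
all.equal(as.numeric(Y), Y_rows)   # the two forms agree
```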

B.  ORDINARY  LEAST  SQUARES 

Simple linear regression (SLR) finds the optimal solution to the value of
the regression coefficients ($\beta$) that minimizes the sum of squares error
for a given set of Xs and Ys using the ordinary least squares (OLS) method.
OLS starts with an equation and imposes a restriction that the sum of errors is
zero between the actual response and the expected value of the response,
denoted $E[Y_i]$.

$$E[Y_i] = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_n X_{ni} \tag{12}$$

With  the  errors  equal  to  zero,  take  the  difference  between  the  actual 
response  and  the  expected  value  and  solve  for  the  beta  values. 

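$$Y_i = E[Y_i] \;\Rightarrow\; Y_i - E[Y_i] = 0$$

$$Y_i - (\beta_0 + \beta_1 X_{1i} + \cdots + \beta_n X_{ni}) = 0$$

$$Y_i - \beta_0 - \beta_1 X_{1i} - \cdots - \beta_n X_{ni} = 0 \tag{13}$$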

Furthermore, we are concerned with absolute differences, so we disregard
negative signs by squaring the difference. Lastly, to ensure the appropriate
beta values are found, we sum the squared differences over the entire range of
$(X_i, Y_i)$ so as to minimize the overall sum of squares, denoted Q.

$$Q = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_{1i} - \cdots - \beta_n X_{ni})^2 \tag{14}$$

OLS does not find the exact $\beta_k$ ($k = 0, \ldots, n$) values, but rather
estimates $b_0, b_1, \ldots, b_n$ that minimize the criterion Q over all
$(X_i, Y_i)$ pairs. OLS is completed systematically using computer software.
We can derive the $\beta$ estimates from equation (14) using calculus,
differentiating with respect to each value of $\beta$.


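$$\frac{\partial Q}{\partial \beta_0} = -2 \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_{1i} - \cdots - \beta_n X_{ni}) \tag{15}$$

Substituting $b_k$ for $\beta_k$ and setting the derivative equal to zero, we get

$$-2 \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_{1i} - \cdots - b_n X_{ni}) = 0 \tag{16a}$$

and, after simplifying,

$$\sum_{i=1}^{n} (Y_i - b_0 - b_1 X_{1i} - \cdots - b_n X_{ni}) = 0 \tag{16b}$$

$$\sum_{i=1}^{n} Y_i - n b_0 - b_1 \sum_{i=1}^{n} X_{1i} - \cdots - b_n \sum_{i=1}^{n} X_{ni} = 0 \tag{16c}$$

$$\sum_{i=1}^{n} Y_i = n b_0 + b_1 \sum_{i=1}^{n} X_{1i} + \cdots + b_n \sum_{i=1}^{n} X_{ni} \tag{16d}$$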

Equation (16d) makes up one part of the normal equations. The form is
unique for the first point estimate, $b_0$. For the second (and following)
normal equations, we consider the remaining point estimates ($b_1, \ldots, b_n$).


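$$\frac{\partial Q}{\partial \beta_1} = -2 \sum_{i=1}^{n} X_{1i} (Y_i - \beta_0 - \beta_1 X_{1i} - \cdots - \beta_n X_{ni}) \tag{17a}$$

$$-2 \sum_{i=1}^{n} X_{1i} (Y_i - b_0 - b_1 X_{1i} - \cdots - b_n X_{ni}) = 0 \tag{17b}$$

$$\sum_{i=1}^{n} X_{1i} (Y_i - b_0 - b_1 X_{1i} - \cdots - b_n X_{ni}) = 0 \tag{17c}$$

$$\sum_{i=1}^{n} X_{1i} Y_i = b_0 \sum_{i=1}^{n} X_{1i} + b_1 \sum_{i=1}^{n} X_{1i}^2 + \cdots + b_n \sum_{i=1}^{n} X_{ni} X_{1i} \tag{17d}$$

Equation (17d) is the second normal equation. We express the normal
equations in matrix notation as follows:

$$\begin{bmatrix} \sum Y_i \\ \sum X_{1i} Y_i \\ \vdots \\ \sum X_{ni} Y_i \end{bmatrix} = \begin{bmatrix} n b_0 + b_1 \sum X_{1i} + \cdots + b_n \sum X_{ni} \\ b_0 \sum X_{1i} + b_1 \sum X_{1i}^2 + \cdots + b_n \sum X_{ni} X_{1i} \\ \vdots \\ b_0 \sum X_{ni} + b_1 \sum X_{1i} X_{ni} + \cdots + b_n \sum X_{ni}^2 \end{bmatrix} \tag{18a}$$

Equation (18a) becomes

$$\begin{bmatrix} \sum Y_i \\ \sum X_{1i} Y_i \\ \vdots \\ \sum X_{ni} Y_i \end{bmatrix} = \begin{bmatrix} n & \sum X_{1i} & \cdots & \sum X_{ni} \\ \sum X_{1i} & \sum X_{1i}^2 & \cdots & \sum X_{ni} X_{1i} \\ \vdots & \vdots & & \vdots \\ \sum X_{ni} & \sum X_{1i} X_{ni} & \cdots & \sum X_{ni}^2 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_n \end{bmatrix} \tag{18b}$$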


or,  equivalently 


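$$\mathbf{X}'\mathbf{Y} = \mathbf{X}'\mathbf{X}\,\mathbf{b} \tag{19}$$

where $\mathbf{b}$ is the vector of least squares regression coefficients, $b_0, \ldots, b_n$. To
obtain the estimates for the regression coefficients, we use matrix
multiplication on equation (19), premultiplying both sides by $(\mathbf{X}'\mathbf{X})^{-1}$.

$$\mathbf{X}'\mathbf{Y} = \mathbf{X}'\mathbf{X}\,\mathbf{b}$$

$$(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{X}\,\mathbf{b}$$

Since $(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{X} = \mathbf{I}$ and $\mathbf{I}\mathbf{b} = \mathbf{b}$:

$$(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y} = \mathbf{b} \tag{20}$$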
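
As a minimal sketch of the normal-equation solution, assuming simulated data (the names `x1`, `x2`, `y` and the coefficient values are illustrative, not taken from the study), the estimates can be computed directly from the matrix formula and compared with the coefficients that R's `lm()` reports.

```r
# Minimal sketch with simulated data; all names and values are illustrative.
set.seed(2)
n   <- 50
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 0.5 * dat$x2 + rnorm(n)

X <- model.matrix(~ x1 + x2, data = dat)   # design matrix with a column of ones
Y <- dat$y

# b = (X'X)^{-1} X'Y, written exactly as the matrix formula
b_explicit <- solve(t(X) %*% X) %*% t(X) %*% Y

# Same estimates, solving X'X b = X'Y without forming the inverse explicitly
b_solve <- solve(crossprod(X), crossprod(X, Y))

# Reference: coefficients from lm()
b_lm <- coef(lm(y ~ x1 + x2, data = dat))

cbind(b_explicit, b_solve, b_lm)
```

Solving the linear system with `solve(A, b)` is usually preferred numerically over forming the inverse; `lm()` itself uses a QR decomposition, so the three columns agree only up to rounding error.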

A final step of this phase is to use the estimated regression coefficients
($b_k$) to find the estimate of the regression function ($\hat{Y}_i$, “Y-hat”).
This process results in what statisticians call the “fitted model.” The fitted
model is no more than the original model with the substituted beta estimates
($b_k$). Before this is accomplished, we summarize:

(a) Original regression model form:

$$Y_i = E[Y_i] + \varepsilon_i \tag{21a}$$

where, in general,

$$E[Y_i] = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_n X_{ni} \tag{21b}$$

(b) After obtaining estimates for $\beta_k$, equation (20), the fitted regression
model form is:

$$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_n X_{ni} \tag{22}$$

(c) In order to obtain a specific value ($\hat{Y}_i$) of the estimated
regression function at the level $X_i$ of the predictor variable, a
substitution of the $b_k$'s into equation (22) is made. Equation (22) will not
give us a perfect fit, but will get us as close as we can to the actual value,
given the power of our predictors. The difference between the original Y and
its estimate, Y-hat, is the subject of the next section.

C. ERROR TERMS AND RESIDUALS

Until now, we have not discussed the error term, epsilon ($\varepsilon_i$). Using
equation (19), we define the general error of the model as:

$$\varepsilon_i = Y_i - E[Y_i] \tag{23}$$

Equation (23) implies that we must know the true (expected) value, but the
true value is unknown. However, we know the value of its estimate, namely, the
fitted value ($\hat{Y}_i$). We substitute the estimate for the expected value
and find the deviation. This difference is termed “residual” ($e_i$) and
defined as:

$$e_i = Y_i - \hat{Y}_i \tag{24}$$

Residuals  help  determine  the  appropriateness  of  a  particular  regression 
model.  Earlier,  we  forced  the  residual  to  be  zero  in  developing  the  normal 
equations.  This  ensured  we  were  able  to  find  estimated  values  of  the  regression 
coefficients  that  minimized  the  sum  of  squares.  Another  form  of  equation  (14)  is: 


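$$Q = \sum_{i=1}^{n} (Y_i - E[Y_i])^2 = \sum_{i=1}^{n} (\varepsilon_i)^2 = 0 \tag{25a}$$

When Y-hat is substituted in for $E[Y]$ and $e_i$ is substituted for $\varepsilon_i$, we get:

$$\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} (e_i)^2 = 0 \tag{25b}$$

This is the same thing as saying:

$$\sum_{i=1}^{n} (e_i)^2 = 0$$
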
The calculation found in equations (25a) and (25b) yields the error (or
“residual”) sum of squares. Realistically, the error terms vary for each
$(X_i, Y_i)$ pair. The goal is to bring this variability as close to zero as
possible. In order to get an idea of the variability of the probability
distribution of Y, we must estimate the average variance of the error terms,
$\sigma^2$. Much like $E[Y]$, $\sigma^2$ is unknown but we can obtain its
unbiased estimator using the error sum of squares (SSE) (equation 25b) and
dividing by the degrees of freedom ($n - 2$). We call this estimator the mean
square error (MSE).


$$s^2 = \frac{SSE}{n-2} = \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2} = \frac{\sum_{i=1}^{n} (e_i)^2}{n-2} = MSE \tag{26}$$
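
The residual, SSE, and MSE calculations can be sketched in R for a single-predictor model, which is the case where the degrees of freedom equal n − 2. The data are simulated and the names `x`, `y`, and `fit` are illustrative assumptions.

```r
# Minimal sketch with simulated data; all names and values are illustrative.
set.seed(3)
n <- 40
x <- runif(n, 0, 10)
y <- 3 + 1.5 * x + rnorm(n, sd = 2)

fit <- lm(y ~ x)

e   <- y - fitted(fit)     # residuals: observed minus fitted values
SSE <- sum(e^2)            # error (residual) sum of squares
MSE <- SSE / (n - 2)       # two estimated parameters: b0 and b1

# With an intercept in the model, the residuals sum to zero numerically,
# while the sum of squared residuals is minimized rather than zero.
sum(e)

c(SSE = SSE, MSE = MSE, sigma2_from_summary = summary(fit)$sigma^2)
```

The last line compares the hand-computed MSE with the squared residual standard error that `summary()` reports for the same fit.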


At times, we are interested in the deviation of the response ($Y_i$) from the
average ($\bar{Y}$, “Y-bar”). We define this as the measure of uncertainty in
predicting the outcome when the predictors are not taken into account. Similar
to equation (24), we calculate it using a sum of squares, called the sum of
squares total (SSTO).

$$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = SSTO \tag{27}$$

Notice, SSTO and SSE help us find the formula for measuring the
variability of the $Y_i$ associated with the regression line. We call this the
sum of squares regression (SSR) and use the following formula:

$$SSR = SSTO - SSE = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 - \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 \tag{28}$$


Finally,  we  use  SSTO,  SSE,  and  SSR  to  measure  the  effect  of  X  in 
reducing  the  variation  in  Y  when  expressed  as  a  ratio,  called  R-squared. 


D.  MODEL  COMPARISON  USING  R2 

SLR analysis provided an initial basis to compare models called the $R^2$
(R-squared) or the coefficient of multiple determination. We interpret $R^2$ as
the proportion of variation in the response (Y) explained by use of the set of
variables $X_1, \ldots, X_{p-1}$, where p is the number of parameters. $R^2$
falls between zero and one, assuming the value 0 when all $b_k = 0$ and the
value 1 when all Y observations fall directly on the fitted regression surface
($e = 0$). We represent $R^2$ as:


$$R^2 = \frac{SSR}{SSTO} = 1 - \frac{SSE}{SSTO} \tag{29a}$$

In the event we add more variables to our model, SSR can only increase and
SSE can only decrease, thus increasing our $R^2$. However, to avoid erroneous
inflation of the coefficient of multiple determination by adding more
variables, most statistical textbooks recommend using an adjusted $R^2$, or
“$R_a^2$”. We calculate R-squared-adjusted by dividing each sum of squares by
its associated degrees of freedom as follows.


$$R_a^2 = 1 - \frac{SSE/(n-p)}{SSTO/(n-1)} = 1 - \left(\frac{n-1}{n-p}\right)\frac{SSE}{SSTO} \tag{29b}$$
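
The sums of squares and both versions of R-squared can be sketched in R. The data are simulated, and the helper `r2_by_hand()` and the deliberately irrelevant predictors `x2` and `noise` are hypothetical names introduced only for this illustration. The sketch shows R-squared not decreasing when irrelevant predictors are added, while the adjusted value applies a degrees-of-freedom penalty.

```r
# Minimal sketch with simulated data; all names and values are illustrative.
set.seed(4)
n   <- 60
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
dat$y <- 1 + 2 * dat$x1 + rnorm(n)     # only x1 truly drives y

# Hypothetical helper: R-squared and adjusted R-squared from the sums of squares
r2_by_hand <- function(fit) {
  y    <- model.response(model.frame(fit))
  n    <- length(y)
  p    <- length(coef(fit))            # number of parameters, incl. intercept
  SSE  <- sum(residuals(fit)^2)
  SSTO <- sum((y - mean(y))^2)
  SSR  <- SSTO - SSE
  c(R2 = SSR / SSTO,
    R2_adj = 1 - (SSE / (n - p)) / (SSTO / (n - 1)))
}

fit_small <- lm(y ~ x1, data = dat)
fit_big   <- lm(y ~ x1 + x2 + noise, data = dat)

rbind(small = r2_by_hand(fit_small), big = r2_by_hand(fit_big))
```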


E.  INFLUENCE  OF  VARIABLES  IN  THE  PRESENCE  OF  OTHERS 

Multiple variables may describe the same characteristic or influence on a
response, often without the researcher initially knowing it. It is possible
that the redundancy of two or more predictors does more harm than good by
inefficiently assigning variance in the response. Regression analysis defines
multicollinearity as correlation between predictor variables. Several effects
of multicollinearity are variation in estimated regression coefficients as
sample populations change and unsatisfactory regression fit. In addition, a
lower $R^2$ may occur as two or more predictor variables relate to each other.

Pairwise  coefficients  of  simple  correlation  or  Variance  Inflation  Factors 
(VIF)  help  diagnose  multicollinearity.  We  calculate  VIF  using  the  following 
formula: 

$$(VIF)_k = (1 - R_k^2)^{-1}, \quad k = 1, 2, \ldots, p-1 \tag{30}$$
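
A variance inflation factor calculation can be sketched in R, assuming the usual definition in which $R_k^2$ is the coefficient of determination obtained by regressing predictor k on the remaining predictors. The data are simulated, and `x2` is constructed to be strongly correlated with `x1`; all names are illustrative.

```r
# Minimal sketch with simulated data; all names and values are illustrative.
set.seed(5)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n, sd = 0.3)   # deliberately correlated with x1
x3 <- rnorm(n)                        # unrelated to the other predictors
preds <- data.frame(x1, x2, x3)

# VIF_k = 1 / (1 - R_k^2), where R_k^2 comes from regressing X_k on the rest
vif_by_hand <- sapply(names(preds), function(k) {
  others <- setdiff(names(preds), k)
  r2_k   <- summary(lm(reformulate(others, response = k), data = preds))$r.squared
  1 / (1 - r2_k)
})

round(vif_by_hand, 2)
```

The correlated pair `x1` and `x2` should show much larger factors than `x3`, whose value should stay close to 1. The same numbers can be cross-checked against the diagonal of the inverse of the predictor correlation matrix, `diag(solve(cor(preds)))`.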
APPENDIX F. LOGISTIC REGRESSION APPROACH EXPLAINED


Note:  Mathematical  formulation  garnered  from  Montgomery,  Peck,  & 
Vining  (2001). 

Logistic regression was used when the outcome Y was categorical with
two possible outcomes, either “1” or “0” (1 = “YES,” 0 = “NO,” or vice versa).
$Y_i$ is considered a Binomial($n_i$, $\pi_i$) random variable with $\pi_i$
defined as the probability that $Y_i = 1$ and $1 - \pi_i$ as the probability
that $Y_i = 0$.


$$E[Y_i] = 1 \times \Pr(Y_i = 1) + 0 \times \Pr(Y_i = 0) = 1 \times (\pi_i) + 0 \times (1 - \pi_i) = \pi_i \tag{31}$$

The linear predictor ($\eta_i$) of the expected value ($\pi_i$) takes the form

$$\eta_i = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_n X_{ni} \tag{32a}$$


We use the following transformation to link $\pi_i$ to the linear predictor $\eta_i$.

Logistic Link Function:

$$g(\pi_i) = \ln\left(\frac{\pi_i}{1 - \pi_i}\right) = \eta_i \tag{32b}$$

We solved using algebra to get

$$\pi_i = \frac{\exp(\beta_0 + \beta_1 X_1 + \cdots + \beta_{p-1} X_{p-1})}{1 + \exp(\beta_0 + \beta_1 X_1 + \cdots + \beta_{p-1} X_{p-1})} = \frac{\exp(\eta_i)}{1 + \exp(\eta_i)} \tag{33a}$$
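
The algebra that inverts the logistic link can be written out step by step. This is a standard derivation, shown here only to fill in the intermediate steps:

$$
\begin{aligned}
\ln\left(\frac{\pi_i}{1-\pi_i}\right) &= \eta_i \\
\frac{\pi_i}{1-\pi_i} &= \exp(\eta_i) \\
\pi_i &= \exp(\eta_i)\,(1-\pi_i) \\
\pi_i\,\bigl(1+\exp(\eta_i)\bigr) &= \exp(\eta_i) \\
\pi_i &= \frac{\exp(\eta_i)}{1+\exp(\eta_i)}
\end{aligned}
$$

Because $\exp(\eta_i) > 0$ for every real $\eta_i$, the resulting $\pi_i$ always lies strictly between 0 and 1, as a probability must.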

Equation (33a) has the same form as the logistic cumulative distribution
function. The logistic distribution is very similar to the normal distribution
and has the nice properties of mean equal to zero and standard deviation
$\sigma = \pi/\sqrt{3}$.
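
These properties can be checked numerically in R with the built-in logistic distribution functions `plogis()`, `qlogis()`, and `dlogis()`. The grid of linear-predictor values `eta` is an arbitrary choice made for this sketch.

```r
# Minimal numerical check; the grid of eta values is arbitrary.
eta <- seq(-4, 4, by = 0.5)

# Inverse-logit formula written out by hand
pi_manual <- exp(eta) / (1 + exp(eta))

# It matches the logistic cumulative distribution function
all.equal(pi_manual, plogis(eta))

# The logit link maps the probabilities back to the linear predictor
all.equal(qlogis(pi_manual), eta)

# Mean and variance of the standard logistic distribution by integration
m <- integrate(function(x) x * dlogis(x), -Inf, Inf)$value
v <- integrate(function(x) x^2 * dlogis(x), -Inf, Inf)$value

c(mean = m, sd = sqrt(v), pi_over_sqrt3 = pi / sqrt(3))
```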

We use the method of maximum likelihood to estimate the regression
coefficients, denoted $\hat{\beta}$ (“Beta-hat”). These values, found oftentimes
through numerical techniques or iteratively reweighted least squares, may then
be substituted into equation (33a) and reveal the following fitted logistic
regression model:

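$$\hat{\pi}_i = \frac{\exp(\hat{\beta}_0 + \hat{\beta}_1 X_1 + \cdots + \hat{\beta}_{p-1} X_{p-1})}{1 + \exp(\hat{\beta}_0 + \hat{\beta}_1 X_1 + \cdots + \hat{\beta}_{p-1} X_{p-1})} = \frac{\exp(\hat{\eta}_i)}{1 + \exp(\hat{\eta}_i)} = \hat{y}_i \tag{33b}$$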

To summarize, we started with the link function, equation (32), used the
maximum likelihood estimators ($\hat{\beta}$) and substituted the result into
equation (33). For given values of the predictor(s) $X_i$, we estimated the
probability of the response $\pi_i$. If that probability is close to “1,” we
associated the response with a “YES,” and if the probability is closer to “0”
than it is to “1,” then we associated the response with “NO.”
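
The steps summarized above can be sketched in R on simulated data. The single predictor `x`, the true coefficients, and the 0.5 cutoff are assumptions made for illustration only; the retention model in Appendix G uses a different, empirically chosen cutoff.

```r
# Minimal sketch with simulated data; all names and values are illustrative.
set.seed(6)
n <- 500
x <- rnorm(n)
eta_true <- -0.5 + 1.2 * x                       # assumed true linear predictor
y <- rbinom(n, size = 1, prob = plogis(eta_true))

# Maximum likelihood fit with the logit link (the default for binomial)
fit <- glm(y ~ x, family = binomial)
beta_hat <- coef(fit)

# Substitute the estimates into the inverse-logit formula
eta_hat <- beta_hat[1] + beta_hat[2] * x
pi_hat  <- exp(eta_hat) / (1 + exp(eta_hat))

# The hand-computed probabilities match the fitted values from glm()
all.equal(unname(pi_hat), unname(fitted(fit)))

# Associate each response with "YES" (1) or "NO" (0) at the assumed cutoff
y_class <- as.integer(pi_hat > 0.5)
table(observed = y, predicted = y_class)
mean(y_class != y)                               # misclassification rate
```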
APPENDIX G. “R” COMPUTER CODE AND PLOTS


A.  LINEAR  MODEL  R  CODE 


HP4=read.csv(file.choose())  #choose the hardiness data file interactively

#pairwise correlations among selected columns
HP4cor1=data.frame(HP4[c(9,10,11,12,13,14,15,22)])
cor(HP4cor1)

cor(HP4$recath2,HP4$playedsport1)
#.86 correlation between the two. confirm with model matrix below

h4=HP4[-c(1,2,3,9,12)] #take out cols 1:3, WCS and CLS

#LM1
h.lm1=lm(hrdns2~.,data=h4)
hcor2=cor(model.matrix(h.lm1))
diag(hcor2)=0
which(abs(hcor2)>.5) #abs() on the correlations so strong negative ones are flagged too
hcor2[282];hcor2[315] #confirmed cor btwn recath and playedsport

#update data frame minus recath
h4.new=HP4[-c(1,2,3,7,9,12)]
h.lm2=lm(hrdns2~.,data=h4.new)
summary(h.lm1)
summary(h.lm2)

#stepwise with h.lm2
h.step1=step(h.lm2,direction="both",trace=FALSE)
summary(h.step1)

#Plot of [possibly] most sig predictors
pairs(~hrdns2+ceer+pae+eas+aas+fas+gender+playedsport1+m.degree+typehs,
data=h4.new,cex.labels=2)

#interactions (all two-way terms)
h.lm3=lm(hrdns2~.^2,data=h4.new); summary(h.lm3)
h.step2=step(h.lm2,scope=~.^2,direction="both",trace=FALSE);
summary(h.step2)

#ANOVA between stepwise models
anova(h.step1,h.step2,test="Chi")

#correlations
hcor3=cor(model.matrix(h.step2))
diag(hcor3)=0
corstr=c(which(abs(hcor3)>.5)) #abs() on the correlations so strong negative ones are flagged too
uni=unique(hcor3[corstr])

# diagnostics
par(mfrow=c(2,2))
plot(h.lm2)
plot(h.step2,cex.labels=2)

#Kruskal Wallis
kruskal.test(hrdns2~gender,data=h4.new)
kruskal.test(hrdns2~playedsport1,data=h4.new)
kruskal.test(hrdns2~typehs,data=h4.new)
kruskal.test(hrdns2~f.degree,data=h4.new)
names(kruskal.test(hrdns2~m.degree,data=h4.new))

#explore gender
plot(h4.new$gender,h4.new$hrdns2,ylab="Total Hardiness")
title(main="Gender versus Hardiness", ylab="Total Hardiness")

stripchart(h4.new$hrdns2 ~ h4.new$gender, vertical=TRUE, method="jitter",
pch=16, col="red",ylab="Total Hardiness")
yhat<-tapply(fitted(h.step2),h4.new$gender,mean) #mean fitted value per group
for(i in 1:length(yhat)){
lines(c(i-.2,i+.2),rep(yhat[i],2))
}
title(main="Hardiness by Gender")

#h4.new2=h4.new[-685,]
#hard=table(h4.new$hrd,h4.new$gender)
#cbind (hard, round (100 * hard[,1] / rowSums (hard), 1), round (100 * hard[,2] / rowSums (hard), 1))

# Tukey Test (anova) for Gender
a1=aov(hrdns2~gender,data=h4.new)
TukeyHSD(a1);plot(TukeyHSD(a1))

#explore typehs
hard2=table(h4.new$hrdns2,h4.new$typehs)
sum(hard2[,1])

#additional plots of Hardiness by Cat variable
# playedsport1
stripchart(h4.new$hrdns2~h4.new$playedsport1, vertical=TRUE, method="jitter",
pch=16, col="red",ylab="Total Hardiness")
yhat<-tapply(fitted(h.step2),h4.new$playedsport1,mean)
for(i in 1:length(yhat)){
lines(c(i-.2,i+.2),rep(yhat[i],2))
}
title(main="Hardiness by Varsity Sport Player")

#typehs
stripchart(h4.new$hrdns2 ~ h4.new$typehs, vertical=TRUE, method="jitter",
pch=16, col="red",ylab="Total Hardiness")
yhat<-tapply(fitted(h.step2),h4.new$typehs,mean)
for(i in 1:length(yhat)){
lines(c(i-.2,i+.2),rep(yhat[i],2))
}
title(main="Hardiness by Type of High School")


B.  LINEAR  MODEL  PAIRS  PLOT  OF  MOST  SIGNIFICANT  PREDICTORS 


Figure  12.  Pairs  Plot  of  Most  Significant  Predictors  of  Hardiness 
[Figure: diagnostic panels including “Residuals vs Fitted” and “Normal Q-Q”; axis labels: Residuals, Standardized residuals, Fitted values, Theoretical Quantiles, Leverage]


Figure  13.  Linear  Model  Diagnostics 
[Figure: “Hardiness by Varsity Sport Player”; y-axis: Total Hardiness]


Figure  14.  Hardiness  versus  Intercollegiate  Athlete  (1=”Yes”) 
[Figure: “Hardiness by Type of High School”]


Figure  15.  Hardiness  versus  Type  of  High  School 


C.  LOGISTIC  REGRESSION  MODEL  R  CODE 


#*** RTN #2 *** ACTIVE DUTY VS. LOSS
RTN=read.csv(file.choose())  #choose the retention data file interactively
names(RTN)
rtn=RTN[-c(1,2,3)]
names(rtn)

pairs(status~babr+ceer+pae+eas+aas+fas+m.degree+f.degree+cm2+co2+ch2+apsc+
mpsc+ppsc+typ.ath,data=rtn)

#1. GLM1
rtn.glm1=glm(status~.,family=binomial,data=rtn)
summary(rtn.glm1)

# ANOVA for glm1 versus step (step.rtn1 is created by the stepwise call further below)
anova(rtn.glm1,step.rtn1,test="Chi")

#GLM2 clog-log
rtn.glm2=update(rtn.glm1,family=binomial(link="cloglog"))
summary(rtn.glm2)

# ANOVA for logit versus clog-log
anova(step.rtn1,rtn.glm2,test="Chi")
#Analysis of Deviance Table
#Model 1: status ~ babr + ceer + pae + eas + aas + fas + m.degree + f.degree +
#  cm2 + co2 + ch2 + apsc + mpsc + ppsc + typ.ath
#Model 2: status ~ babr + ceer + pae + eas + aas + fas + m.degree + f.degree +
#  cm2 + co2 + ch2 + apsc + mpsc + ppsc + typ.ath
#  Resid. Df Resid. Dev Df Deviance
#1      1371     1735.9
#2      1371     1737.2  0  -1.3125
# first - glm1 (log-log) is best, by (lower) deviance

# stepwise on rtn.glm1
step.rtn1=step(rtn.glm1,trace=FALSE)
summary(step.rtn1) #stepwise glm1 is better than glm1 and glm2 by AIC

# compare stepwise glm1 with glm3-interaction (all two-way terms)
rtn.glm3=glm(status~.^2,family=binomial,data=rtn)
summary(rtn.glm3)
# generated higher AIC. no good, Proceed with glm1 (stepwise)

#2. Use GAM and termplot to determine if transformation of a
# predictor is needed.
library(gam)
rtn.gam1=gam(status~typ.ath+m.degree+f.degree+s(cm2)+s(co2)+s(ch2)
+s(ceer)+s(pae)+s(eas)+s(aas)+s(fas)+s(apsc)+s(mpsc)+s(ppsc)+babr,
family=binomial,data=rtn)
rtn.gam2=gam(status~s(aas)+babr,family=binomial,data=rtn)
par(mfrow=c(2,1))
termplot(rtn.gam2,partial.resid=TRUE,col.res="dark green")
### no transformations needed!!!!

#4(a and b). drop1 pvals<.05 are significant and should be kept in model
drop1(step.rtn1,test="Chisq") # confirmed babr IN and aas should be kept in model

#5. Confusion Matrix
##glm1
y<-rtn$status
pi.hat=predict(step.rtn1,type="response")  #fitted probabilities
#head(pi.hat) # these are with old data
#head(pi.hat>.5)
tbl1=table(y,pi.hat>.62);tbl1
# y FALSE TRUE
# 0 284 222 # Loss: 112 classified correctly, 394 incorrectly;
# 1 307 590 # Actv: 83 classified incorrectly and 814 classified correctly.
# There are more personnel Active Duty so they are being classified better.

#row proportions (tbl1 is indexed column-major: [1],[3] are row "0"; [2],[4] are row "1")
q1=tbl1[1]/sum(tbl1[1],tbl1[3])
q2=tbl1[3]/sum(tbl1[1],tbl1[3])
q3=tbl1[2]/sum(tbl1[2],tbl1[4])
q4=tbl1[4]/sum(tbl1[2],tbl1[4])
ptbl1=data.frame(q1,q2,q3,q4);ptbl1
mtx=matrix(data=ptbl1,nrow=2,ncol=2,byrow=TRUE,dimnames = list(c("0", "1"),
c("FALSE", "TRUE"))); mtx
(q1+q4)*100;(q2+q3)*100
# .62 Best Class rate!!
#   FALSE     TRUE
# 0 0.5612648 0.4387352
# 1 0.3422520 0.657748
# [1] 121.9013
# [1] 78.09871

##6(a) Misclassification Rate
z=y
z!= (pi.hat>.62) # trues if misclassified, falses if classified correctly
sum(z!= (pi.hat>.62))
# 529 # number of misclassifications
mean(z!= (pi.hat>.62))
# 0.3770492

##7(a). Cross-validation
library(boot)
cost<-function(y,pi.hat) mean(y!=(pi.hat>.62))  #misclassification cost at the .62 cutoff
cv.glm(rtn,step.rtn1,cost,K=10)$delta
# 0.3877406 0.3843210 #our cv estimate

D.  LOGISTIC  REGRESSION  MODEL  PLOTS 


[Figure: pairs plot; recoverable panel labels include status, log(mpsc), log(ppsc), ch2, typ.ath, and f.degree]



Figure  16.  Pairs  Plot  (Graduation  “Status”  versus  Separation) 